The invention features a method for identifying a cDNA nucleic acid encoding a mammalian protein having a signal sequence, which
method includes the following steps: a) providing a library of mammalian cDNA; b) ligating the library of mammalian cDNA to DNA
encoding alkaline phosphatase lacking both a signal sequence and a membrane anchor sequence to form ligated DNA; c) transforming
bacterial cells with the ligated DNA to create a bacterial cell clone library; d) isolating DNA comprising the mammalian cDNA from at
least one clone in the bacterial cell clone library; e) separately transfecting DNA isolated from clones in step (d) into mammalian cells
which do not express alkaline phosphatase to create a mammalian cell clone library wherein each clone in the mammalian cell clone library
corresponds to a clone in the bacterial cell clone library; f) identifying a clone in the mammalian cell clone library which expresses alkaline
phosphatase; g) identifying the clone in the bacterial cell clone library corresponding to the clone in the mammalian cell clone library
identified in step (f); and h) isolating and sequencing a portion of the mammalian cDNA present in the bacterial cell library clone identified
in step (g) to identify a mammalian cDNA encoding a mammalian protein having a signal sequence.
METHOD FOR IDENTIFYING GENES ENCODING NOVEL
SECRETED OR MEMBRANE-ASSOCIATED PROTEINS

Background of the Invention
The invention relates to methods for identifying
genes encoding novel proteins.

There is considerable medical interest in secreted
and membrane-associated mammalian proteins. Many such
proteins, for example, cytokines, are important for
inducing the growth or differentiation of cells with
which they interact or for triggering one or more
specific cellular responses.

An important goal in the design and development of
new therapies is the identification and characterization
of secreted proteins and the genes which encode them.
Traditionally, this goal has been pursued by identifying
a particular response of a particular cell type and
attempting to isolate and purify a secreted protein
capable of eliciting the response. This approach is
limited by a number of factors. First, certain secreted
proteins will not be identified because the responses
they evoke may not be recognizable or measurable.
Second, because in vitro assays must be used to isolate
and purify secreted proteins, somewhat artificial systems
must be used. This raises the possibility that certain
important secreted proteins will not be identified unless
the features of the in vitro system (e.g., cell line,
culture medium, or growth conditions) accurately reflect
the in vivo milieu. Third, the complexity of the effects
of secreted proteins on the cells with which they
interact vastly complicates the task of isolating
important secreted proteins. Any given cell can be
simultaneously subject to the effects of two or more
secreted proteins. Because any two secreted proteins
will not have the same effect on a given cell and because
the effect of a first secreted protein on a given cell
can alter the effect of a second secreted protein on the
same cell, it can be difficult to isolate the secreted
protein or proteins responsible for a given physiological
response. In addition, certain secreted and
membrane-associated proteins may be expressed at levels that are
too low to detect by biological assay or protein
purification.

In another approach, genes encoding secreted
proteins have been isolated using DNA probes or PCR
oligonucleotides which recognize sequence motifs present
in genes encoding known secreted proteins. In addition,
homology-directed searching of Expressed Sequence Tag
(EST) sequences derived by high-throughput sequencing of
specific cDNA libraries has been used to identify genes
encoding secreted proteins. These approaches depend for
their success on a high degree of similarity between the
DNA sequences used as probes and the unknown genes or EST
sequences.

More recently, methods have been developed that
permit the identification of cDNAs encoding a signal
sequence capable of directing the secretion of a
particular protein from certain cell types. Both Honjo,
U.S. Patent No. 5,525,486, and Jacobs, U.S. Patent No.
5,536,637, describe such methods. These methods are said
to be capable of identifying secreted proteins.

The demonstrated clinical utility of several
secreted proteins in the treatment of human disease, for
example, erythropoietin, granulocyte-macrophage colony
stimulating factor (GM-CSF), human growth hormone, and
various interleukins, has generated considerable interest
in the identification of novel secreted proteins. The
method of the invention can be employed as a tool in the
discovery of such novel proteins.
Summary of the Invention
The invention features a method for isolating
cDNAs and identifying novel secreted and
membrane-associated (e.g., transmembrane) mammalian proteins. The
method of the invention relies upon the observation that
the majority of secreted and membrane-associated proteins
possess at their amino termini a stretch of hydrophobic
amino acids referred to as a signal sequence.
The signal sequence directs secreted and
membrane-associated proteins to a sub-cellular membrane
compartment termed the endoplasmic reticulum, from which
these proteins are dispatched for secretion or
presentation on the cell surface.

The invention describes a method in which cDNAs
that encode signal sequences for secreted or
membrane-associated proteins are isolated by virtue of their
abilities to direct the export of the reporter protein,
alkaline phosphatase (AP), from mammalian cells. The
present method has major advantages over other signal
peptide trapping approaches. The present method is
highly sensitive. This facilitates the isolation of
signal peptide associated proteins that may be difficult
to isolate with other techniques. Moreover, the present
method is amenable to high-throughput screening techniques and
automation. Combined with a novel method for cDNA
library construction in which directional random primed
cDNA libraries are prepared, the invention comprises a
powerful approach to the large scale isolation of
novel secreted proteins.

The invention features a method for identifying a
cDNA nucleic acid encoding a mammalian protein having a
signal sequence, which method includes the following
steps:

a) providing a library of mammalian cDNA;

b) ligating the library of mammalian cDNA to DNA
encoding alkaline phosphatase lacking both a signal
sequence and a membrane anchor sequence to form ligated
DNA;

c) transforming bacterial cells with the ligated
DNA to create a bacterial cell clone library;

d) isolating DNA comprising the mammalian cDNA
from at least one clone in the bacterial cell clone
library;

e) separately transfecting DNA isolated from
clones in step (d) into mammalian cells which do not
express alkaline phosphatase to create a mammalian cell
clone library wherein each clone in the mammalian cell
clone library corresponds to a clone in the bacterial
cell clone library;

f) identifying a clone in the mammalian cell clone
library which expresses alkaline phosphatase;

g) identifying the clone in the bacterial cell
clone library corresponding to the clone in the mammalian
cell clone library identified in step (f); and

h) isolating and sequencing a portion of the
mammalian cDNA present in the bacterial cell library
clone identified in step (g) to identify a mammalian cDNA
encoding a mammalian protein having a signal sequence.

A cDNA library is a collection of nucleic acid
molecules that are a cDNA copy of a sample of mRNA.

In another aspect, the invention features the ptrAP3
expression vector.

In another aspect, the invention features a
substantially pure preparation of ethb0018f2 protein.
Preferably, the ethb0018f2 protein includes an amino acid
sequence substantially identical to the amino acid
sequence shown in FIG. 5 (SEQ ID NO: 5) and is derived from
a mammal, for example, a human.
The invention also features purified DNA (for
example, cDNA) which includes a sequence encoding an
ethb0018f2 protein (for example, the ethb0018f2 protein
of FIG. 5; SEQ ID NO: 5); a vector and a cell which
includes a purified DNA of the invention; and a method of
producing a recombinant ethb0018f2 protein involving
providing a cell transformed with DNA encoding ethb0018f2
protein positioned for expression in the cell, culturing
the transformed cell under conditions for expressing the
DNA, and isolating the recombinant ethb0018f2 protein.
The invention further features recombinant ethb0018f2
protein produced by such expression of a purified DNA of
the invention.

By "ethb0018f2 protein" is meant a polypeptide
which has a biological activity possessed by
naturally-occurring ethb0018f2 protein. Preferably, such a
polypeptide has an amino acid sequence which is at least
85%, preferably 90%, and most preferably 95% or even 99%
identical to the amino acid sequence of the ethb0018f2
protein of FIG. 5 (SEQ ID NO: 5).

By "substantially identical" is meant a
polypeptide or nucleic acid having a sequence that is at
least 85%, preferably 90%, and more preferably 95% or
more identical to the sequence of the reference amino
acid or nucleic acid sequence. For polypeptides, the
length of the reference polypeptide sequence will
generally be at least 16 amino acids, preferably at least
20 amino acids, more preferably at least 25 amino acids,
and most preferably 35 amino acids. For nucleic acids,
the length of the reference nucleic acid sequence will
generally be at least 50 nucleotides, preferably at least
60 nucleotides, more preferably at least 75 nucleotides,
and most preferably 110 nucleotides.

Sequence identity can be measured using sequence
analysis software (e.g., Sequence Analysis Software
Package of the Genetics Computer Group, University of
Wisconsin Biotechnology Center, 1710 University Avenue,
Madison, WI 53705).

In the case of polypeptide sequences which are
less than 100% identical to a reference sequence, the
non-identical positions are preferably, but not
necessarily, conservative substitutions for the reference
sequence. Conservative substitutions typically include
substitutions within the following groups: glycine and
alanine; valine, isoleucine, and leucine; aspartic acid
and glutamic acid; asparagine and glutamine; serine and
threonine; lysine and arginine; and phenylalanine and
tyrosine.
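
As an illustration only, the substitution groups listed above can be encoded as a small lookup. The following Python sketch is not part of the method of the invention; the names `is_conservative` and `classify_positions` are hypothetical, and the two sequences are assumed to be already aligned, ungapped, and of equal length.

```python
# Conservative substitution groups exactly as listed in the text
# (one-letter amino acid codes).
CONSERVATIVE_GROUPS = [
    {"G", "A"},        # glycine, alanine
    {"V", "I", "L"},   # valine, isoleucine, leucine
    {"D", "E"},        # aspartic acid, glutamic acid
    {"N", "Q"},        # asparagine, glutamine
    {"S", "T"},        # serine, threonine
    {"K", "R"},        # lysine, arginine
    {"F", "Y"},        # phenylalanine, tyrosine
]


def is_conservative(ref_residue: str, query_residue: str) -> bool:
    """True if two different residues belong to the same listed group."""
    a, b = ref_residue.upper(), query_residue.upper()
    if a == b:
        return False  # identical residues are not a substitution
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


def classify_positions(reference: str, query: str) -> dict:
    """Count identical, conservative and non-conservative positions.

    Assumes (hypothetically) that both sequences are pre-aligned,
    ungapped and of equal length.
    """
    if len(reference) != len(query):
        raise ValueError("sequences must be pre-aligned to equal length")
    counts = {"identical": 0, "conservative": 0, "non_conservative": 0}
    for r, q in zip(reference.upper(), query.upper()):
        if r == q:
            counts["identical"] += 1
        elif is_conservative(r, q):
            counts["conservative"] += 1
        else:
            counts["non_conservative"] += 1
    return counts


if __name__ == "__main__":
    print(classify_positions("GAVKF", "AAVRW"))
```

In this toy comparison, G to A and K to R fall within listed groups, while F to W does not.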

Where a particular polypeptide is said to have a
specific percent identity to a reference polypeptide of a
defined length, the percent identity is relative to the
reference peptide. Thus, a peptide that is 50% identical
to a reference polypeptide that is 100 amino acids long
can be a 50 amino acid polypeptide that is completely
identical to a 50 amino acid long portion of the
reference polypeptide. It might also be a 100 amino acid
long polypeptide which is 50% identical to the reference
polypeptide over its entire length. Of course, many
other polypeptides will meet the same criteria.
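
The two cases described above can be checked with a short calculation. The following Python sketch is offered only as an illustration of "percent identity relative to the reference"; the function name, the `offset` parameter, and the toy sequences are hypothetical, and the candidate is assumed to be aligned to the reference without gaps.

```python
def percent_identity_to_reference(reference: str, candidate: str,
                                  offset: int = 0) -> float:
    """Percent identity of `candidate`, expressed relative to the full
    length of `reference`.

    `candidate` is assumed (hypothetically) to be aligned without gaps,
    starting at 0-based position `offset` of the reference.
    """
    if offset < 0 or offset + len(candidate) > len(reference):
        raise ValueError("candidate must lie within the reference")
    matches = sum(
        1 for i, residue in enumerate(candidate)
        if reference[offset + i] == residue
    )
    # The denominator is the reference length, not the candidate length.
    return 100.0 * matches / len(reference)


if __name__ == "__main__":
    reference = "ACDEFGHIKL" * 10            # toy 100 amino acid reference
    fragment = reference[25:75]              # 50 residues, all identical
    half_changed = "".join(                  # 100 residues, every other one changed
        r if i % 2 == 0 else "X" for i, r in enumerate(reference)
    )
    print(percent_identity_to_reference(reference, fragment, offset=25))
    print(percent_identity_to_reference(reference, half_changed))
```

Both toy candidates are 50% identical to the 100 amino acid reference, although one is a perfect 50 residue fragment and the other is a full-length sequence with half of its positions changed.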

By "protein" and "polypeptide" is meant any chain
of amino acids, regardless of length or
post-translational modification (e.g., glycosylation or
phosphorylation).

By "substantially pure" is meant a preparation
which is at least 60% by weight (dry weight) the compound
of interest, i.e., an ethb0018f2 protein. Preferably the
preparation is at least 75%, more preferably at least
90%, and most preferably at least 99%, by weight the
compound of interest. Purity can be measured by any
appropriate method, e.g., column chromatography,
polyacrylamide gel electrophoresis, or HPLC analysis.

By "purified DNA" is meant DNA that is not
immediately contiguous with both of the coding sequences
with which it is immediately contiguous (one on the 5'
end and one on the 3' end) in the naturally occurring
genome of the organism from which it is derived. The
term therefore includes, for example, a recombinant DNA
which is incorporated into a vector; into an autonomously
replicating plasmid or virus; or into the genomic DNA of
a prokaryote or eukaryote, or which exists as a separate
molecule (e.g., a cDNA or a genomic DNA fragment produced
by PCR or restriction endonuclease treatment) independent
of other sequences. It also includes a recombinant DNA
which is part of a hybrid gene encoding additional
polypeptide sequence.

By "substantially identical" is meant an amino
acid sequence which differs only by conservative amino
acid substitutions, for example, substitution of one
amino acid for another of the same class (e.g., valine
for glycine, arginine for lysine, etc.) or by one or more
non-conservative substitutions, deletions, or insertions
located at positions of the amino acid sequence which do
not destroy the function of the protein (assayed, e.g.,
as described herein). Preferably, such a sequence is at
least 85%, more preferably 90%, and most preferably 95%
identical at the amino acid level to the sequence of FIG.
5 (SEQ ID NO: 5). For nucleic acids, the length of
comparison sequences will generally be at least 50
nucleotides, preferably at least 60 nucleotides, more
preferably at least 75 nucleotides, and most preferably
110 nucleotides. A "substantially identical" nucleic
acid sequence codes for a substantially identical amino
acid sequence as defined above.

By "transformed cell" is meant a cell into which
(or into an ancestor of which) has been introduced, by
means of recombinant DNA techniques, a DNA molecule
encoding (as used herein) ethb0018f2 protein.

By "positioned for expression" is meant that the
DNA molecule is positioned adjacent to a DNA sequence
which directs transcription and translation of the
sequence (i.e., facilitates the production of ethb0018f2
protein).

By "purified antibody" is meant antibody which is
at least 60%, by weight, free from the proteins and
naturally-occurring organic molecules with which it is
naturally associated. Preferably, the preparation is at
least 75%, more preferably at least 90%, and most
preferably at least 99%, by weight, antibody.

By "specifically binds" is meant an antibody which
recognizes and binds ethb0018f2 protein but which does
not substantially recognize and bind other molecules in a
sample, e.g., a biological sample, which naturally
includes ethb0018f2 protein.

Unless otherwise defined, all technical and
scientific terms used herein have the same meaning as
commonly understood by one of ordinary skill in the art
to which this invention belongs. Although methods and
materials similar or equivalent to those described herein
can be used in the practice or testing of the present
invention, the preferred methods and materials are
described below. All publications, patent applications,
patents, and other references mentioned herein are
incorporated by reference in their entirety. In case of
conflict, the present specification, including
definitions, will control. In addition, the materials,
methods, and examples are illustrative only and not
intended to be limiting.

Other features and advantages of the invention
will be apparent from the following detailed description,
and from the claims.

Brief Description of the Drawings
Figure 1 is a schematic drawing of a portion of
the ptrAP3 vector.

Figure 2 is a representation of the DNA sequence
of the ptrAP3 vector (SEQ ID NO: 1). The bold, underlined
portion is the small fragment removed prior to cDNA
insertion. The italic, underlined portion is
the alkaline phosphatase sequence.

Figure 3 is a representation of the amino acid
sequence of human placental alkaline phosphatase
(Accession No. P05187). The underlined portion is the
signal sequence. The bold, underlined portion is the
membrane anchor sequence.

Figure 4 is a representation of the amino acid
sequence of the alkaline phosphatase encoded by ptrAP3.

Figure 5 is a representation of the cDNA and amino
acid sequence of a portion of a novel secreted protein
identified using the method described in Example 1.

Figure 6 is a representation of an alignment of
the amino acid sequence of clone ethb0018f2 (referred to
here as 8f2) and proteins containing conserved IgG
domains. The proteins are D38492 (neural adhesion
molecule f3); P20241 (Drosophila Neuroglian);
P32004 (human neural adhesion molecule L1); P35331
(chick neural adhesion molecule related protein);
Q02246 (human Axonin 1); U11031 (rat neural adhesion
molecule BIG1); and X65224 (chicken Neurofascin).
In this figure, conserved motifs within the
IgG domain are highlighted in bold.
Detailed Description
In general terms, the method of the invention
entails the following steps:

1. Preparation of a randomly primed cDNA library
using cDNA prepared from mRNA extracted from mammalian
cells or tissue. The cDNA is inserted into a mammalian
expression vector adjacent to a cDNA encoding placental
alkaline phosphatase which lacks a secretory signal.

2. Amplification of the cDNA library in bacteria.

3. Isolation of the cDNA library.

4. Transfection of the resulting cDNA library
into mammalian cells.

5. Assay of supernatants from the transfected
mammalian cells for alkaline phosphatase activity.

6. Isolation and sequencing of plasmid DNA clones
registering a positive score in the alkaline phosphatase
assay.

7. Isolation of full length cDNA clones of novel
proteins having a signal sequence.

The mammalian cDNA used to create the cDNA library
can be prepared using any known method. Generally, the
cDNA is produced from mRNA. The mRNA can be isolated
from any desired tissue or cell type. For example,
peripheral blood cells, primary cells, tumor cells, or
other cells may be used as a source of mRNA.

The expression vector harboring the modified
alkaline phosphatase gene can be any vector suitable for
expression of proteins in mammalian cells.

The mammalian cells used in the transfection step
can be any suitable mammalian cells, e.g., CHO cells,
mouse L cells, HeLa cells, VERO cells, mouse 3T3 cells,
and 293 cells.

Described below is a specific example of the
method of the invention. Also described below are two
genes, one known and one novel, identified using this
method.

Example 1

Step 1 Generation of Mammalian Signal Peptide Trap cDNA
Libraries
Vector

A cDNA library was prepared using ptrAP3, a
mammalian expression vector containing a cDNA encoding
human placental alkaline phosphatase (AP) lacking a
signal sequence (FIG. 1 and FIG. 2, SEQ ID NO: 1). When
ptrAP3 is transfected into a mammalian cell line, such as
COS7 cells, AP protein is neither expressed nor secreted
since the AP cDNA of ptrAP3 does not encode a
translation initiating methionine, a signal peptide, or a
membrane anchor sequence. FIG. 3 (SEQ ID NO: 2) provides
the amino acid sequence of naturally occurring AP. FIG.
4 (SEQ ID NO: 3) provides the amino acid sequence of the
form of AP encoded by ptrAP3. However, insertion of a
cDNA encoding a signal peptide sequence into ptrAP3 such
that the signal sequence within the cDNA is fused to and
in frame with AP facilitates both the expression and
secretion of AP protein upon transfection of the DNA into
COS7 cells or other mammalian cells. The presence of AP
activity in the supernatants of transfected COS7 cells
therefore indicates the presence of a signal sequence in
the cDNA of interest.
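
The "in frame" requirement described above can be pictured with a small check. The following Python sketch is only an illustration under stated assumptions: the function name `in_frame_start_sites` and the `junction_offset` parameter are hypothetical, the actual nucleotides at the ptrAP3 cloning junction are not modeled, and the toy insert sequences are invented. It reports which ATG codons in an insert read through to the 3' end of the insert without a stop codon and arrive in the reporter's reading frame.

```python
from typing import List

STOP_CODONS = {"TAA", "TAG", "TGA"}


def in_frame_start_sites(insert: str, junction_offset: int = 0) -> List[int]:
    """Return 0-based positions of ATG codons that could drive an
    in-frame fusion with a downstream reporter.

    `junction_offset` (hypothetical) is the number of extra nucleotides
    (0, 1 or 2) lying between the 3' end of the insert and the first
    full reporter codon.
    """
    seq = insert.upper()
    hits = []
    for start in range(len(seq) - 2):
        if seq[start:start + 3] != "ATG":
            continue
        # The distance from the ATG to the reporter must be a multiple of 3.
        if (len(seq) - start + junction_offset) % 3 != 0:
            continue
        # Walk the complete codons from the ATG to the end of the insert.
        codons = [seq[i:i + 3] for i in range(start, len(seq) - 2, 3)]
        if any(codon in STOP_CODONS for codon in codons):
            continue  # a stop codon would terminate before the reporter
        hits.append(start)
    return hits


if __name__ == "__main__":
    print(in_frame_start_sites("GGATGAAACTGCTG"))   # ATG reads through in frame
    print(in_frame_start_sites("ATGTAAGGG"))        # stop codon after the ATG
    print(in_frame_start_sites("ATGAAAC"))          # out of frame at the junction
```

A real screen does not need such a calculation, since the AP assay itself reports whether a productive fusion occurred; the sketch only makes explicit why roughly one insert orientation and frame in three can give a signal.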

cDNA Synthesis and Ligation

cDNA for ligation to the ptrAP3 vector was
prepared from messenger RNA isolated from human fetal
brain tissue (Clontech, Palo Alto, CA: Catalog #6525-1)
by a modification of a commercially available "ZAP cDNA
synthesis kit" (Stratagene; La Jolla, CA: Catalog #
200401). Synthesis of cDNA involved the following steps.



WO 98/22491 



PCT/US97/20201 




(a) Single stranded cDNA was synthesized from 5 µg
of human fetal brain messenger RNA using a random hexamer
primer incorporating an XhoI restriction site
(CTCGAG): 5'-CTGACTCGAGNNNNNN-3' (SEQ ID NO: 4). This
represented a deviation from the Stratagene protocol and
resulted in a population of randomly primed cDNA
molecules. Random priming was employed rather than the
oligo d(T) priming method suggested by Stratagene in
order to generate short cDNA fragments, some of which
would be expected to encode signal
sequences.

(b) The single stranded cDNA generated in step (a)
was rendered double stranded, and DNA linkers containing
a free EcoRI overhang were ligated to both ends of the
double stranded cDNAs using reagents and protocols from
the Stratagene ZAP cDNA synthesis kit according to the
manufacturer's instructions.

(c) The linker-adapted double-stranded cDNA
generated in step (b) was digested with XhoI to generate
a free XhoI overhang at the 3' end of the cDNAs using
reagents from the Stratagene ZAP cDNA synthesis kit
according to the manufacturer's instructions.

(d) Linker-adapted double-stranded cDNAs were size
selected by gel filtration through SEPHACRYL™ S-500 cDNA
Size Fractionation Columns (Gibco BRL; Bethesda, MD:
Catalog #18092-015) according to the manufacturer's
instructions.

(e) Size selected, double-stranded cDNAs
containing a free EcoRI overhang at the 5' end and a free
XhoI overhang at the 3' end were ligated to the ptrAP3
backbone which had been digested with EcoRI and XhoI and
purified from the small, released fragment by agarose gel
electrophoresis.

(f) Ligated plasmid DNAs were transformed into E.
coli strain DH10B by electroporation.

This process resulted in a library of cDNA clones
composed of several million random primed cDNAs (some of
which will encode signal sequences) prepared from human
fetal brain messenger RNA, fused to the AP reporter cDNA,
in the mammalian expression vector ptrAP3.
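
The structure of the random primer used in step (a) can be examined programmatically. The following Python sketch is illustrative only; the helper name `expand_degenerate` is hypothetical. It locates the XhoI recognition sequence (CTCGAG) within the primer of SEQ ID NO: 4 and counts the distinct primer molecules represented by the six degenerate N positions.

```python
from itertools import product

PRIMER = "CTGACTCGAGNNNNNN"   # SEQ ID NO: 4 as given in the text
XHOI_SITE = "CTCGAG"          # XhoI recognition sequence


def expand_degenerate(primer: str):
    """Yield every concrete sequence represented by a primer in which
    N stands for any of A, C, G or T."""
    choices = ["ACGT" if base == "N" else base for base in primer]
    for combination in product(*choices):
        yield "".join(combination)


# 0-based position at which the XhoI site begins within the primer.
site_start = PRIMER.find(XHOI_SITE)

# Number of distinct primers in the degenerate pool (4 ** 6).
n_variants = sum(1 for _ in expand_degenerate(PRIMER))

if __name__ == "__main__":
    print(site_start, n_variants)
```

The XhoI site sits immediately 5' of the random hexamer, so every cDNA primed this way carries an XhoI site at its 3' end, which is what allows the directional EcoRI/XhoI ligation of step (e).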

Step 2 Plating and Automated Picking of Bacterial
Colonies

Next, the transformed bacterial cells were plated,
and individual clones were identified. A sample of
transformed E. coli containing the random primed human
fetal brain cDNA library described in Step 1 was plated
for growth as individual colonies, using standard
procedures. Each E. coli colony contained an individual
cDNA clone fused to the AP reporter in the ptrAP3
expression vector. Approximately 20,000 such E. coli
colonies were plated, representing approximately 0.5% of
the total cDNA library.
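
As a consistency check on the approximate figures above, the implied size of the whole library can be computed directly. This short Python calculation is illustrative only; the variable names are hypothetical and the inputs are the approximate values stated in the text.

```python
colonies_plated = 20_000          # approximate number of colonies plated
fraction_of_library = 0.005       # approximately 0.5% of the library

# Implied total number of clones in the library.
estimated_library_size = round(colonies_plated / fraction_of_library)

if __name__ == "__main__":
    print(estimated_library_size)
```

The result, about four million clones, agrees with the description of the library in Step 1 as being composed of several million random primed cDNAs.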

Next, E. coli colonies were picked from the plates
and inoculated into deep-well 96-well plates containing 1
ml of growth medium prepared by standard procedures.
The E. coli cultures
were grown overnight by standard procedures. Each plate
was identified by number. Within each plate, each well
contained an individual cDNA clone in the ptrAP3 vector
identified by well position.

Finally, plasmid DNA was extracted from the
overnight E. coli cultures using a semi-automated 96-well
plasmid DNA miniprep procedure, employing standard
procedures for bacterial lysis, genomic DNA precipitation
and plasmid DNA purification.

The plasmid DNA extraction was performed as
follows:

(a) E. coli were centrifuged for 20 minutes using
a Beckman Centrifuge at 3200 rpm.




(b) Supernatant was discarded and E. coli pellets
were resuspended in 130 µl WP1 (50 mM TRIS (pH 7.5), 10
mM EDTA, 100 µg/ml RNase A) resuspension solution using a
Titertek Multidrop™ apparatus.

(c) E. coli pellets were resuspended by vortexing.

(d) 130 µl WP2 (0.2 M NaOH, 0.5% SDS) lysing
solution was added to each well, and the samples were
mixed by vortexing for 5 seconds.

(e) 130 µl WP3 (125 mM potassium acetate, pH 4.8)
neutralizing solution was added to each well, and the
samples were mixed by vortexing for 5 seconds.

(f) Samples were placed on ice for 15 minutes,
mixed by vortexing for 5 seconds, and recentrifuged for
10 minutes at 3200 rpm in a Beckman Centrifuge.

(g) Supernatant (crude DNA extract) was
transferred from each well of each 96-well plate into a
96-well filter plate (Polyfiltronics) using a
TOMTEC Quadra 96™ transfer apparatus.

(h) 480 µl of Wizard™ Midiprep DNA Purification
Resin (Promega) was added to each well of each plate
containing crude DNA extract using a Titertek Multidrop
apparatus and the samples were left for 5 minutes.

(i) Each 96-well filter plate was placed on a
vacuum housing (Polyfiltronics) and the liquid in each
well was removed by suction generated by vacuum created
with a Lab Port Vacuum pump.

(j) The Wizard Midiprep DNA Purification Resin in
each well (to which plasmid DNA was bound) was washed
four times with 600 µl of Wizard Wash 1.

(k) Plates were centrifuged for 5 minutes to
remove excessive moisture from the Wizard Midiprep DNA
Purification Resin.

(l) Purified plasmid DNAs were eluted from the
Wizard Midiprep DNA Purification Resin into collection
plates by addition of 50 µl deionized water to each well
using a Multidrop 8 Channel Pipette, incubation at room
temperature for 15 minutes, and centrifugation for 5
minutes (3200 rpm, Beckman centrifuge).

This process resulted in preparation of plasmid
DNA contained in 96-well plates with each well containing
an individual cDNA clone ligated in the ptrAP3 expression
vector. Individual clones were identified by plate
number and well position.
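
Because each clone is tracked by plate number and well position, the same address can follow a clone from bacterial culture through plasmid preparation. The following Python sketch shows one possible bookkeeping scheme; it is an assumption made for illustration, not the scheme used in the example. The function names, the row-by-row filling order, and the "A01" to "H12" well labels are all hypothetical.

```python
import string

ROWS = string.ascii_uppercase[:8]   # rows A to H of a 96-well plate
COLS = 12                           # columns 1 to 12
WELLS_PER_PLATE = len(ROWS) * COLS  # 96


def well_for_index(clone_index: int):
    """Map a 0-based clone index to (plate_number, well_label).

    Plates are numbered from 1 and filled row by row (A01, A02, ...).
    """
    if clone_index < 0:
        raise ValueError("clone_index must be non-negative")
    plate, within_plate = divmod(clone_index, WELLS_PER_PLATE)
    row, col = divmod(within_plate, COLS)
    return plate + 1, f"{ROWS[row]}{col + 1:02d}"


def index_for_well(plate_number: int, well_label: str) -> int:
    """Inverse of well_for_index: recover the 0-based clone index."""
    row = ROWS.index(well_label[0].upper())
    col = int(well_label[1:]) - 1
    if plate_number < 1 or not 0 <= col < COLS:
        raise ValueError("plate or well out of range")
    return (plate_number - 1) * WELLS_PER_PLATE + row * COLS + col


if __name__ == "__main__":
    print(well_for_index(0))
    print(well_for_index(96))
    print(index_for_well(2, "A01"))
```

Keeping a single plate and well address per clone is what lets a positive well in a later assay be traced back to the bacterial culture that holds the corresponding plasmid.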

Step 4 Transfection of DNAs into COS7 cells
To determine which of the cDNA clones contained
within the cDNA library encoded functional signal
peptides, individual plasmid DNA preparations were
transfected into COS7 cells as follows.

For each 96-well plate of DNA preparations, one
96-well tissue culture plate containing approximately 10,000
COS7 cells per well was prepared using standard
procedures.

Immediately prior to DNA transfection, the COS7
cell culture medium in each well of each 96-well plate
was replaced with 80 µl of Opti-MEM (Gibco-BRL; catalog
#31985-021) containing 1 µl of Lipofectamine (Gibco-BRL)
and 2 µl (approximately 100-200 ng) of DNA prepared as
described above. Thus, each well of each 96-well plate
containing COS7 cells received DNA representing one
individual cDNA clone from the cDNA library in ptrAP3.
The COS7 cells were incubated with the
Opti-MEM/Lipofectamine/DNA mixture overnight to allow
transfection of cells with the plasmid DNAs.

After overnight incubation, the transfection
medium was removed from the cells and replaced with 80 µl
fresh medium composed of Opti-MEM + 1% fetal calf serum.
Cells were incubated overnight.

Step 5 Alkaline Phosphatase Assay

The secreted alkaline phosphatase activity of the
transfected COS7 cells was measured as follows. Samples
(10 µl) of supernatants from the transfected COS7 cells
were transferred from each well of each 96-well plate
into one well of a Microfluor scintillation plate
(Dynatech: Location Catalog #011-010-7805). AP activity
in the supernatants was determined using the
Phospha-Light Kit (Tropix Inc.; catalog #BP300). AP assays were
performed according to the manufacturer's instructions
using a Wallace Micro-Beta scintillation counter.
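
The text does not state how a well was scored as positive. Purely as an illustration, one simple rule would be to call a well positive when its signal exceeds a multiple of the plate median. In the following Python sketch the function name `positive_wells`, the five-fold cutoff, and the example counts are all hypothetical and are not taken from the example above.

```python
from statistics import median


def positive_wells(counts: dict, fold_over_background: float = 5.0) -> list:
    """Return the wells whose signal exceeds a multiple of the plate median.

    `counts` maps a well label to its luminescence reading. Using the
    plate median as background is an assumption that holds only when
    most wells on a plate are negative.
    """
    background = median(counts.values())
    cutoff = background * fold_over_background
    return sorted(well for well, signal in counts.items() if signal > cutoff)


if __name__ == "__main__":
    # Invented readings for five wells of one plate.
    example = {"A01": 100, "A02": 120, "A03": 90, "A04": 5000, "A05": 110}
    print(positive_wells(example))
```

A median-based background is used here because only a small fraction of random cDNA inserts is expected to supply an in-frame signal sequence, so most wells on any plate should read near background.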

Step 6 Sequencing and Analysis of Positive Clones

The individual plasmid DNAs scoring positive in
the COS7 cell AP secretion assay were analyzed further by
DNA sequencing using standard procedures. The resulting
DNA sequence information was used to perform BLAST
sequence similarity searches of nucleotide and protein
databases to ascertain whether the clone in question
encodes either 1) a known secreted or membrane-associated
protein possessing a signal sequence, or 2) a putative
novel, secreted or membrane-associated protein possessing
a putative novel signal sequence.

Identification of the Protein Tyrosine Phosphatase Sigma
(PTPσ) Signal Sequence by Mammalian Signal Peptide trAP

Employing the method described in Example 1, a
cDNA clone designated ethb005c07 was found to score
positive in the COS7 cell transfection AP assay. BLAST
similarity searching with the DNA sequence from this
clone identified ethb005c07 as a cDNA encoding the signal
sequence of protein tyrosine phosphatase sigma (PTPσ), a
previously described protein that is well established in
the scientific literature to be a transmembrane protein
(Pulido et al., Proc. Nat'l Acad. Sci. USA 92:11686,
1995).



Identification of a Novel Immunoglobulin Domain-Containing
Protein by Mammalian Signal Peptide trAP

Employing the method described in Example 1, a
cDNA clone designated ethb0018f2 was found to score
positive in the COS7 cell transfection AP assay. DNA
sequencing revealed that ethb0018f2 [illegible].
Thus, the ethb0018f2 cDNA contains an open
reading frame (FIG. 5, SEQ ID NO: 5) fused to the AP
reporter. Inspection of the ethb0018f2 protein sequence
revealed the presence of a putative signal sequence
between amino acids 1 to 20, predicted by the signal
peptide prediction algorithm, SignalP (von Heijne,
Nucleic Acids Res. 14:4683-90, 1986). Thus, ethb0018f2
encodes part of a
secreted/membrane-associated protein. BLAST similarity searching of
nucleic acid and protein databases with the ethb0018f2
DNA sequence from this clone revealed similarity to a
family of proteins known to contain a protein motif
referred to as an Immunoglobulin or IgG domain.

Further visual inspection of the ethb0018f2
protein sequence resulted in the identification of 5
consecutive IgG repeats, defined by a conserved spacing
of cysteine, tryptophan, tyrosine, and cysteine residues
(FIG. 5).

FIG. 6 is a depiction of a protein sequence
alignment between clone ethb0018f2 (referred to as 8f2)
and seven related proteins known to contain IgG domains
that are also known to be expressed in the brain. These
proteins are rat neural adhesion molecule f3 (D38492),
Drosophila Neuroglian (P20241), human neural adhesion
molecule L1 (P32004), chick neural adhesion molecule
related (P35331), human Axonin 1 (Q02246), rat neural
adhesion molecule BIG1 (U11031) and chicken Neurofascin
(X65224). Given this sequence similarity, it is likely
that clone ethb0018f2 represents a partial cDNA clone
representing a novel protein, expressed in the brain,
which contains multiple, consecutive IgG domains.
Specifically, since the closest relatives of clone
ethb0018f2 are believed to function as neural adhesion
molecules, it is likely that clone ethb0018f2 represents
a partial cDNA clone of a novel neural adhesion molecule.

Other Embodiments 
It is to be understood that, while the invention
has been described in conjunction with the detailed
description thereof, the foregoing description is
intended to illustrate and not limit the scope of the
invention, which is defined by the scope of the appended
claims.
(iii) NUMBER OF SEQUENCES: 14

(iv) CORRESPONDENCE ADDRESS:
(A
(C
(F

(v) COMPUTER READABLE FORM:
(A
(B
(C
(D

(vi)
(vii
(B

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4951 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

AAGCTTGGCT GTGGAATGTG TGTCAGTTAG GGTGTGGAAA GTCCCCAGGC TCCCCAGCAG 60
GCAGAAGTAT GCAAAGCATG CATCTCAATT AGTCAGCAAC CAGGTGTGGA AAGTCCCCAG 120
GCTCCCCAGC AGGCAGAAGT ATGCAAAGCA TGCATCTCAA TTAGTCAGCA ACCATAGTCC 180
CGCCCCTAAC TCCGCCCATC CCGCCCCTAA CTCCGCCCAG TTCCGCCCAT TCTCCGCCCC 240
ATGGCTGACT AATTTTTTTT ATTTATGCAG AGGCCGAGGC CGCCTCGGCC TCTGAGCTAT 300
TCCAGAAGTA GTGAGGAGGC TTTTTTGGAG GCCTAGGCTT TTGCAAAAAG CTCCTCCGAT 360
CGAGGGGCTC GCATCTCTCC TTCACGCGCC CGCCGCCCTA CCTGAGGCCG CCATCCACGC 420
CGGTTGAGTC GCGTTCTGCC GCCTCCCGCC TGTGGTGCCT CCTGAACTGC GTCCGCCGTC 480
TAGGTAAGTT TAAAGCTCAG GTCGAGACCG GGCCTTTGTC CGGCGCTCCC TTGGAGCCTA 540
CCTAGACTCA GCCGGCTCTC CACGCTTTGC CTGACCCTGC TTGCTCAACT CTACGTCTTT 600
GTTTCGTTTT CTGTTCTGCG CCGTTACAGA TCCAAGCTCT GAAAAACCAG AAAGTTAACT 660
GGTAAGTTTA GTCTTTTTGT CTTTTATTTC AGGTCCCAGG TCCCGGATCC GGTGATCCAA 720
ATCTAAGAAC TGCTCCTCAG TGAGTGTTGC CTTTACTTCT AGGCCTGTAC GGAAGTGTTA 780
CTTCTGCTCT AAAAGCTGCG GAATTCGCAC CACCGTAGTT TTTACGCCCG GTGAGCGCTC 840
CACCCGCACC TACAAGCGCG TGTATGATGA GGTGTACGGC GACGAGGACC TGCTTGAGCA 900
GGCCAACGAG CGCCTCGGGG AGTTTGCCTA CGGAAAGCGG CATAAGGACA TGTTGGCGTT 960
GCCGCTGGAC GAGGGCAACC CAACACCTAG CCTAAAGCCC GTGACACTGC AGCAGGTGCT 1020
GCCCACGCTT GCACCGTCCG AAGAAAAGCG CGGCCTAAAG CGCGAGTCTG GTGACTTGGC 1080
ACCCACCGTG CAGCTGATGG TACCCAAGCG CCAGCGACTG GAAGATGTCT TGGAAAAAAT 1140
GACCGTGGAG CCTGGGCTGG AGCCCGAGGT CCGCGTGCGG CCAATCAAGC AGGTGGCACC 1200
GGGACTGGGC GTGCAGACCG TGGACGTTCA GATACCCACC ACCAGTAGCA CTAGTATTGC 1260
CACTGCCACA GAGGGCATGG AGACACAAAC GTCCCCGGTT GCCTAGCTCG AGATCATCCC 1320
AGTTGAGGAG GAGAACCCGG ACTTCTGGAA CCGCGAGGCA GCCGAGGCCC TGGGTGCCGC 1380
CAAGAAGCTG CAGCCTGCAC AGACAGCCGC CAAGAACCTC ATCATCTTCC TGGGCGATGG 1440
GATGGGGGTG TCTACGGTGA CAGCTGCCAG GATCCTAAAA GGGCAGAAGA AGGACAAACT 1500
GGGGCCTGAG ATACCCCTGG CCATGGACCG CTTCCCATAT GTGGCTCTGT CCAAGACATA 1560
CAATGTAGAC AAACATGTGC CAGACAGTGG AGCCACAGCC ACGGCCTACC TGTGCGGGGT 1620
CAAGGGCAAC TTCCAGACCA TTGGCTTGAG TGCAGCCGCC CGCTTTAACC AGTGCAACAC 1680
GACACGCGGC AACGAGGTCA TCTCCGTGAT GAATCGGGCC AAGAAAGCAG GGAAGTCAGT 1740
GGGAGTGGTA ACCACCACAC GAGTGCAGCA CGCCTCGCCA GCCGGCACCT ACGCCCACAC 1800
GGTGAACCGC AACTGGTACT CGGACGCCGA CGTGCCTGCC TCGGCCCGCC AGGAGGGGTG 1860
CCAGGACATC GCTACGCAGC TCATCTCCAA CATGGACATT GACGTGATCC TAGGTGGAGG 1920
CCGAAAGTAC ATGTTTCGCA TGGGAACCCC AGACCCTGAG TACCCAGATG ACTACAGCCA 1980
AGGTGGGACC AGGCTGGACG GGAAGAATCT GGTGCAGGAA TGGCTGGCGA AGCGCCAGGG 2040
TGCCCGGTAT GTGTGGAACC GCACTGAGCT CATGCAGGCT TCCCTGGACC CGTCTGTGAC 2100
CCATCTCATG GGTCTCTTTG AGCCTGGAGA CATGAAATAC GAGATCCACC GAGACTCCAC 2160
ACTGGACCCC TCCCTGATGG AGATGACAGA GGCTGCCCTG CGCCTGCTGA GCAGGAACCC 2220
CCGCGGCTTC TTCCTCTTCG TGGAGGGTGG TCGCATCGAC CATGGTCATC ATGAAAGCAG 2280
GGCTTACCGG GCACTGACTG AGACGATCAT GTTCGACGAC GCCATTGAGA GGGCGGGCCA 2340
GCTCACCAGC GAGGAGGACA CGCTGAGCCT CGTCACTGCC GACCACTCCC ACGTCTTCTC 2400
CTTCGGAGGC TACCCCCTGC GAGGGAGCTC CATCTTCGGG CTGGCCCCTG GCAAGGCCCG 2460
GGACAGGAAG GCCTACACGG TCCTCCTATA CGGAAACGGT CCAGGCTATG TGCTCAAGGA 2520
CGGCGCCCGG CCGGATGTTA CCGAGAGCGA GAGCGGGAGC CCCGAGTATC GGCAGCAGTC 2580
AGCAGTGCCC CTGGACGAAG AGACCCACGC AGGCGAGGAC GTGGCGGTGT TCGCGCGCGG 2640
CCCGCAGGCG CACCTGGTTC ACGGCGTGCA GGAGCAGACC TTCATAGCGC ACGTCATGGC 2700
CTTCGCCGCC TGCCTGGAGC CCTACACCGC CTGCGACCTG GCGCCCCCCG CCGGCACCAC 2760
CGACGCCGCG CACCCGGGTT GAACTAGTCT AGAGAAAAAA CCTCCCACAC CTCCCCCTGA 2820
ACCTGAAACA TAAAATGAAT GCAATTGTTG TTGTTAACTT GTTTATTGCA GCTTATAATG 2880
GTTACAAATA AAGCAATAGC ATCACAAATT TCACAAATAA AGCATTTTTT TCACTGCATT 2940
CTAGTTGTGG TTTGTCCAAA CTCATCAATG TATCTTATCA TGTCTGGATC CCCGGGTACC 3000
GAGCTCGAAT TAATTCCTCT TCCGCTTCCT CGCTCACTGA CTCGCTGCGC TCGGTCGTTC 3060
GGCTGCGGCG AGCGGTATCA GCTCACTCAA AGGCGGTAAT ACGGTTATCC ACAGAATCAG 3120
GGGATAACGC AGGAAAGAAC ATGTGAGCAA AAGGCCAGCA AAAGGCCAGG AACCGTAAAA 3180
AGGCCGCGTT GCTGGCGTTT TTCCATAGGC TCCGCCCCCC TGACGAGCAT CACAAAAATC 3240
GACGCTCAAG TCAGAGGTGG CGAAACCCGA CAGGACTATA AAGATACCAG GCGTTTCCCC 3300
CTGGAAGCTC CCTCGTGCGC TCTCCTGTTC CGACCCTGCC GCTTACCGGA TACCTGTCCG 3360
CCTTTCTCCC TTCGGGAAGC GTGGCGCTTT CTCAATGCTC ACGCTGTAGG TATCTCAGTT 3420
CGGTGTAGGT CGTTCGCTCC AAGCTGGGCT GTGTGCACGA ACCCCCCGTT CAGCCCGACC 3480
GCTGCGCCTT ATCCGGTAAC TATCGTCTTG AGTCCAACCC GGTAAGACAC GACTTATCGC 3540
CACTGGCAGC AGCCACTGGT AACAGGATTA GCAGAGCGAG GTATGTAGGC GGTGCTACAG 3600
AGTTCTTGAA GTGGTGGCCT AACTACGGCT ACACTAGAAG GACAGTATTT GGTATCTGCG 3660
CTCTGCTGAA GCCAGTTACC TTCGGAAAAA GAGTTGGTAG CTCTTGATCC GGCAAACAAA 3720
CCACCGCTGG TAGCGGTGGT TTTTTTGTTT GCAAGCAGCA GATTACGCGC AGAAAAAAAG 3780
GATCTCAAGA AGATCCTTTG ATCTTTTCTA CGGGGTCTGA CGCTCAGTGG AACGAAAACT 3840
CACGTTAAGG GATTTTGGTC ATGAGATTAT CAAAAAGGAT CTTCACCTAG ATCCTTTTAA 3900
ATTAAAAATG AAGTTTTAAA TCAATCTAAA GTATATATGA GTAAACTTGG TCTGACAGTT 3960
ACCAATGCTT AATCAGTGAG GCACCTATCT CAGCGATCTG TCTATTTCGT TCATCCATAG 4020
TTGCCTGACT CCCCGTCGTG TAGATAACTA CGATACGGGA GGGCTTACCA TCTGGCCCCA 4080
GTGCTGCAAT GATACCGCGA GACCCACGCT CACCGGCTCC AGATTTATCA GCAATAAACC 4140
AGCCAGCCGG AAGGGCCGAG CGCAGAAGTG GTCCTGCAAC TTTATCCGCC TCCATCCAGT 4200
CTATTAATTG TTGCCGGGAA GCTAGAGTAA GTAGTTCGCC AGTTAATAGT TTGCGCAACG 4260
TTGTTGCCAT TGCTACAGGC ATCGTGGTGT CACGCTCGTC GTTTGGTATG GCTTCATTCA 4320
GCTCCGGTTC CCAACGATCA AGGCGAGTTA CATGATCCCC CATGTTGTGC AAAAAAGCGG 4380
TTAGCTCCTT CGGTCCTCCG ATCGTTGTCA GAAGTAAGTT GGCCGCAGTG TTATCACTCA 4440
TGGTTATGGC AGCACTGCAT AATTCTCTTA CTGTCATGCC ATCCGTAAGA TGCTTTTCTG 4500
TGACTGGTGA GTACTCAACC AAGTCATTCT GAGAATAGTG TATGCGGCGA CCGAGTTGCT 4560
CTTGCCCGGC GTCAATACGG GATAATACCG CGCCACATAG CAGAACTTTA AAAGTGCTCA 4620
TCATTGGAAA ACGTTCTTCG GGGCGAAAAC TCTCAAGGAT CTTACCGCTG TTGAGATCCA 4680
GTTCGATGTA ACCCACTCGT GCACCCAACT GATCTTCAGC ATCTTTTACT TTCACCAGCG 4740
TTTCTGGGTG AGCAAAAACA GGAAGGCAAA ATGCCGCAAA AAAGGGAATA AGGGCGACAC 4800
GGAAATGTTG AATACTCATA CTCTTCCTTT TTCAATATTA TTGAAGCATT TATCAGGGTT 4860
ATTGTCTCAT GAGCGGATAC ATATTTGAAT GTATTTAGAA AAATAAACAA ATAGGGGTTC 4920
CGCGCACATT TCCCCGAAAA GTGCCACCTG C 4951
(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 530 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Met Leu Leu Leu Leu Leu Leu Leu Gly Leu Arg Leu Gln Leu Ser Leu
1 5 10 15
Gly Ile Ile Pro Val Glu Glu Glu Asn Pro Asp Phe Trp Asn Arg Glu
20 25 30
Ala Ala Glu Ala Leu Gly Ala Ala Lys Lys Leu Gln Pro Ala Gln Thr
35 40 45
Ala Ala Lys Asn Leu Ile Ile Phe Leu Gly Asp Gly Met Gly Val Ser
50 55 60
Thr Val Thr Ala Ala Arg Ile Leu Lys Gly Gln Lys Lys Asp Lys Leu
65 70 75 80
Gly Pro Glu Ile Pro Leu Ala Met Asp Arg Phe Pro Tyr Val Ala Leu
85 90 95
Ser Lys Thr Tyr Asn Val Asp Lys His Val Pro Asp Ser Gly Ala Thr
100 105 110
Ala Thr Ala Tyr Leu Cys Gly Val Lys Gly Asn Phe Gln Thr Ile Gly
115 120 125
Leu Ser Ala Ala Ala Arg Phe Asn Gln Cys Asn Thr Thr Arg Gly Asn
130 135 140
Glu Val Ile Ser Val Met Asn Arg Ala Lys Lys Ala Gly Lys Ser Val
145 150 155 160
Gly Val Val Thr Thr Thr Arg Val Gln His Ala Ser Pro Ala Gly Thr
165 170 175
Tyr Ala His Thr Val Asn Arg Asn Trp Tyr Ser Asp Ala Asp Val Pro
180 185 190
Ala Ser Ala Arg Gln Glu Gly Cys Gln Asp Ile Ala Thr Gln Leu Ile
195 200 205
Ser Asn Met Asp Ile Asp Val Ile Leu Gly Gly Gly Arg Lys Tyr Met
210 215 220
Phe Arg Met Gly Thr Pro Asp Pro Glu Tyr Pro Asp Asp Tyr Ser Gln
225 230 235 240
Gly Gly Thr Arg Leu Asp Gly Lys Asn Leu Val Gln Glu Trp Leu Ala
245 250 255
Lys Arg Gln Gly Ala Arg Tyr Val Trp Asn Arg Thr Glu Leu Met Gln
260 265 270
Ala Ser Leu Asp Pro Ser Val Thr His Leu Met Gly Leu Phe Glu Pro
275 280 285
Gly Asp Met Lys Tyr Glu Ile His Arg Asp Ser Thr Leu Asp Pro Ser
290 295 300
Leu Met Glu Met Thr Glu Ala Ala Leu Arg Leu Leu Ser Arg Asn Pro
305 310 315 320
Arg Gly Phe Phe Leu Phe Val Glu Gly Gly Arg Ile Asp His Gly His
325 330 335
His Glu Ser Arg Ala Tyr Arg Ala Leu Thr Glu Thr Ile Met Phe Asp
340 345 350
Asp Ala Ile Glu Arg Ala Gly Gln Leu Thr Ser Glu Glu Asp Thr Leu
355 360 365
Ser Leu Val Thr Ala Asp His Ser His Val Phe Ser Phe Gly Gly Tyr
370 375 380
Pro Leu Arg Gly Ser Ser Ile Phe Gly Leu Ala Pro Gly Lys Ala Arg
385 390 395 400
Asp Arg Lys Ala Tyr Thr Val Leu Leu Tyr Gly Asn Gly Pro Gly Tyr
405 410 415
Val Leu Lys Asp Gly Ala Arg Pro Asp Val Thr Glu Ser Glu Ser Gly
420 425 430
Ser Pro Glu Tyr Arg Gln Gln Ser Ala Val Pro Leu Asp Glu Glu Thr
435 440 445
His Ala Gly Glu Asp Val Ala Val Phe Ala Arg Gly Pro Gln Ala His
450 455 460
Leu Val His Gly Val Gln Glu Gln Thr Phe Ile Ala His Val Met Ala
465 470 475 480
Phe Ala Ala Cys Leu Glu Pro Tyr Thr Ala Cys Asp Leu Ala Pro Pro
485 490 495
Ala Gly Thr Thr Asp Ala Ala His Pro Gly Arg Ser Val Val Pro Ala
500 505 510
Leu Leu Pro Leu Leu Ala Gly Thr Leu Leu Leu Leu Glu Thr Ala Thr
515 520 525
Ala Pro
530

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 489 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 



Ile Ile Pro Val Glu Glu Glu Asn Pro Asp Phe Trp Asn Arg Glu Ala
1 5 10 15
Ala Glu Ala Leu Gly Ala Ala Lys Lys Leu Gln Pro Ala Gln Thr Ala
20 25 30
Ala Lys Asn Leu Ile Ile Phe Leu Gly Asp Gly Met Gly Val Ser Thr
35 40 45
Val Thr Ala Ala Arg Ile Leu Lys Gly Gln Lys Lys Asp Lys Leu Gly
50 55 60
Pro Glu Ile Pro Leu Ala Met Asp Arg Phe Pro Tyr Val Ala Leu Ser
65 70 75 80
Lys Thr Tyr Asn Val Asp Lys His Val Pro Asp Ser Gly Ala Thr Ala
85 90 95
Thr Ala Tyr Leu Cys Gly Val Lys Gly Asn Phe Gln Thr Ile Gly Leu
100 105 110
Ser Ala Ala Ala Arg Phe Asn Gln Cys Asn Thr Thr Arg Gly Asn Glu
115 120 125
Val Ile Ser Val Met Asn Arg Ala Lys Lys Ala Gly Lys Ser Val Gly
130 135 140
Val Val Thr Thr Thr Arg Val Gln His Ala Ser Pro Ala Gly Thr Tyr
145 150 155 160
Ala His Thr Val Asn Arg Asn Trp Tyr Ser Asp Ala Asp Val Pro Ala
165 170 175
Ser Ala Arg Gln Glu Gly Cys Gln Asp Ile Ala Thr Gln Leu Ile Ser
180 185 190
Asn Met Asp Ile Asp Val Ile Leu Gly Gly Gly Arg Lys Tyr Met Phe
195 200 205
Arg Met Gly Thr Pro Asp Pro Glu Tyr Pro Asp Asp Tyr Ser Gln Gly
210 215 220
Gly Thr Arg Leu Asp Gly Lys Asn Leu Val Gln Glu Trp Leu Ala Lys
225 230 235 240
Arg Gln Gly Ala Arg Tyr Val Trp Asn Arg Thr Glu Leu Met Gln Ala
245 250 255
Ser Leu Asp Pro Ser Val Thr His Leu Met Gly Leu Phe Glu Pro Gly
260 265 270
Asp Met Lys Tyr Glu Ile His Arg Asp Ser Thr Leu Asp Pro Ser Leu
275 280 285
Met Glu Met Thr Glu Ala Ala Leu Arg Leu Leu Ser Arg Asn Pro Arg
290 295 300
Gly Phe Phe Leu Phe Val Glu Gly Gly Arg Ile Asp His Gly His His
305 310 315 320
Glu Ser Arg Ala Tyr Arg Ala Leu Thr Glu Thr Ile Met Phe Asp Asp
325 330 335
Ala Ile Glu Arg Ala Gly Gln Leu Thr Ser Glu Glu Asp Thr Leu Ser
340 345 350
Leu Val Thr Ala Asp His Ser His Val Phe Ser Phe Gly Gly Tyr Pro
355 360 365
Leu Arg Gly Ser Ser Ile Phe Gly Leu Ala Pro Gly Lys Ala Arg Asp
370 375 380
Arg Lys Ala Tyr Thr Val Leu Leu Tyr Gly Asn Gly Pro Gly Tyr Val
385 390 395 400
Leu Lys Asp Gly Ala Arg Pro Asp Val Thr Glu Ser Glu Ser Gly Ser
405 410 415
Pro Glu Tyr Arg Gln Gln Ser Ala Val Pro Leu Asp Glu Glu Thr His
420 425 430
Ala Gly Glu Asp Val Ala Val Phe Ala Arg Gly Pro Gln Ala His Leu
435 440 445
Val His Gly Val Gln Glu Gln Thr Phe Ile Ala His Val Met Ala Phe
450 455 460
Ala Ala Cys Leu Glu Pro Tyr Thr Ala Cys Asp Leu Ala Pro Pro Ala
465 470 475 480
Gly Thr Thr Asp Ala Ala His Pro Gly
485



(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 
CTGGACTCGA GNNNNNN 17 
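
As an aside for readers of the listing, the following Python sketch counts how many distinct oligonucleotides SEQ ID NO: 4 can represent. It assumes that N stands for any of A, C, G or T, which is the usual convention but is not stated in this section. The helper names degeneracy and expand are hypothetical.

```python
from itertools import islice, product

PRIMER = "CTGGACTCGAGNNNNNN"  # SEQ ID NO: 4 as listed
BASES = "ACGT"                # assumed meaning of N: any of the four bases


def degeneracy(primer):
    """Number of distinct sequences a primer with N positions can represent."""
    return len(BASES) ** primer.count("N")


def expand(primer):
    """Yield every concrete sequence obtained by filling each N with a base."""
    slots = [BASES if base == "N" else base for base in primer]
    for combo in product(*slots):
        yield "".join(combo)


print(len(PRIMER))          # length of the listed sequence
print(degeneracy(PRIMER))   # number of distinct oligonucleotides
print(list(islice(expand(PRIMER), 2)))
```

Under that assumption the six N positions give 4^6 = 4096 variants sharing the fixed 11-base prefix.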
(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 465 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(v) FRAGMENT TYPE: internal 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 



Met Trp Leu Val Thr Phe Leu Leu Leu Leu Asp Ser Leu His Lys Ala
1 5 10 15
Arg Pro Glu Asp Val Gly Thr Ser Leu Tyr Phe Val Asn Asp Ser Leu
20 25 30
Gln Gln Val Thr Phe Ser Ser Ser Val Gly Val Val Val Pro Cys Pro
35 40 45
Ala Ala Gly Ser Pro Ser Ala Ala Leu Arg Trp Tyr Leu Ala Thr Gly
50 55 60
Asp Asp Ile Tyr Asp Val Pro His Ile Arg His Val His Ala Asn Gly
65 70 75 80
Thr Leu Gln Leu Tyr Pro Phe Ser Pro Ser Ala Phe Asn Ser Phe Ile
85 90 95
His Asp Asn Asp Tyr Phe Cys Thr Ala Glu Asn Ala Ala Gly Lys Ile
100 105 110
Arg Ser Pro Asn Ile Arg Val Lys Ala Val Phe Arg Glu Pro Tyr Thr
115 120 125
Val Arg Val Glu Asp Gln Arg Ser Met Arg Gly Asn Val Ala Val Phe
130 135 140
Lys Cys Leu Ile Pro Ser Ser Val Gln Glu Tyr Val Ser Val Val Ser
145 150 155 160
Trp Glu Lys Asp Thr Val Ser Ile Ile Pro Glu Asn Arg Phe Phe Ile
165 170 175
Thr Tyr His Gly Gly Leu Tyr Ile Ser Asp Val Gln Lys Glu Asp Ala
180 185 190
Leu Ser Thr Tyr Arg Cys Ile Thr Lys His Lys Tyr Ser Gly Glu Thr
195 200 205
Arg Gln Ser Asn Gly Ala Arg Leu Ser Val Thr Asp Pro Ala Glu Ser
210 215 220
Ile Pro Thr Ile Leu Asp Gly Phe His Ser Gln Glu Val Trp Ala Gly
225 230 235 240
His Thr Val Glu Leu Pro Cys Thr Ala Ser Gly Tyr Pro Ile Pro Ala
245 250 255
Ile Arg Trp Leu Lys Asp Gly Arg Pro Leu Pro Ala Asp Ser Arg Trp
260 265 270
Thr Lys Arg Ile Thr Gly Leu Thr Ile Ser Asp Leu Arg Thr Glu Asp
275 280 285
Ser Gly Thr Tyr Ile Cys Glu Val Thr Asn Thr Phe Gly Ser Ala Glu
290 295 300
Ala Thr Gly Ile Leu Met Val Ile Asp Pro Leu His Val Thr Leu Thr
305 310 315 320
Pro Lys Lys Leu Lys Thr Gly Ile Gly Ser Thr Val Ile Leu Ser Cys
325 330 335
Ala Leu Thr Gly Ser Pro Glu Phe Thr Ile Arg Trp Tyr Arg Asn Thr
340 345 350
Glu Leu Val Leu Pro Asp Glu Ala Ile Ser Ile Arg Gly Leu Ser Asn
355 360 365
Glu Thr Leu Leu Ile Thr Ser Ala Gln Lys Ser His Ser Gly Ala Tyr
370 375 380
Gln Cys Phe Ala Thr Arg Lys Ala Gln Thr Ala Gln Asp Phe Ala Ile
385 390 395 400
Ile Ala Leu Glu Asp Gly Thr Pro Arg Ile Val Ser Ser Phe Ser Glu
405 410 415
Lys Val Val Asn Pro Gly Glu Gln Phe Ser Leu Met Cys Ala Ala Lys
420 425 430
Gly Ala Pro Pro Pro Thr Val Thr Trp Ala Leu Asp Asp Glu Pro Ile
435 440 445
Val Arg Asp Gly Ser His Arg Thr Asn Gln Tyr Thr Met Ser Asp Gly
450 455 460
Thr
465

(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1493 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: Coding Sequence
(B) LOCATION: 99..1493

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 



GGCACGAGGG CGGCTGGGAG CGCGCTGAGC GGGGGAGAGG CGCTGCCGCA CGGCCGGCCA 60
CAGGACCACC TCCCCGGAGA ATAGGGCCTC TTTATGGC ATG TGG CTG GTA ACT TTC 116
Met Trp Leu Val Thr Phe
1 5

CTC CTG CTC CTG GAC TCT TTA CAC AAA GCC CGC CCT GAA GAT GTT GGC 164
Leu Leu Leu Leu Asp Ser Leu His Lys Ala Arg Pro Glu Asp Val Gly
10 15 20

ACC AGC CTC TAC TTT GTA AAT GAC TCC TTG CAG CAG GTG ACC TTT TCC 212
Thr Ser Leu Tyr Phe Val Asn Asp Ser Leu Gln Gln Val Thr Phe Ser
25 30 35

AGC TCC GTG GGG GTG GTG GTG CCC TGC CCG GCC GCG GGC TCC CCC AGC 260
Ser Ser Val Gly Val Val Val Pro Cys Pro Ala Ala Gly Ser Pro Ser
40 45 50

GCG GCC CTT CGA TGG TAC CTG GCC ACA GGG GAC GAC ATC TAC GAC GTG 308
Ala Ala Leu Arg Trp Tyr Leu Ala Thr Gly Asp Asp Ile Tyr Asp Val
55 60 65 70

CCG CAC ATC CGG CAC GTC CAC GCC AAC GGG ACG CTG CAG CTC TAC CCC 356
Pro His Ile Arg His Val His Ala Asn Gly Thr Leu Gln Leu Tyr Pro
75 80 85

TTC TCC CCC TCC GCC TTC AAT AGC TTT ATC CAC GAC AAT GAC TAC TTC 404
Phe Ser Pro Ser Ala Phe Asn Ser Phe Ile His Asp Asn Asp Tyr Phe
90 95 100

TGC ACC GCG GAG AAC GCT GCC GGC AAG ATC CGG AGC CCC AAC ATC CGC 452
Cys Thr Ala Glu Asn Ala Ala Gly Lys Ile Arg Ser Pro Asn Ile Arg
105 110 115

GTC AAA GCA GTT TTC AGG GAA CCC TAC ACC GTC CGG GTG GAG GAT CAA 500
Val Lys Ala Val Phe Arg Glu Pro Tyr Thr Val Arg Val Glu Asp Gln
120 125 130

AGG TCA ATG CGT GGC AAC GTG GCC GTC TTC AAG TGC CTC ATC CCC TCT 548
Arg Ser Met Arg Gly Asn Val Ala Val Phe Lys Cys Leu Ile Pro Ser
135 140 145 150

TCA GTG CAG GAA TAT GTT AGC GTT GTA TCT TGG GAG AAA GAC ACA GTC 596
Ser Val Gln Glu Tyr Val Ser Val Val Ser Trp Glu Lys Asp Thr Val
155 160 165

TCC ATC ATC CCA GAA AAC AGG TTT TTT ATT ACC TAC CAC GGC GGG CTG 644
Ser Ile Ile Pro Glu Asn Arg Phe Phe Ile Thr Tyr His Gly Gly Leu
170 175 180

TAC ATC TCT GAC GTA CAG AAG GAG GAC GCC CTC TCC ACC TAT CGC TGC 692
Tyr Ile Ser Asp Val Gln Lys Glu Asp Ala Leu Ser Thr Tyr Arg Cys
185 190 195

ATC ACC AAG CAC AAG TAT AGC GGG GAG ACC CGG CAG AGC AAT GGG GCA 740
Ile Thr Lys His Lys Tyr Ser Gly Glu Thr Arg Gln Ser Asn Gly Ala
200 205 210

CGC CTC TCT GTG ACA GAC CCT GCT GAG TCG ATC CCC ACC ATC CTG GAT 788
Arg Leu Ser Val Thr Asp Pro Ala Glu Ser Ile Pro Thr Ile Leu Asp
215 220 225 230

GGC TTC CAC TCC CAG GAA GTG TGG GCC GGC CAC ACC GTG GAG CTG CCC 836
Gly Phe His Ser Gln Glu Val Trp Ala Gly His Thr Val Glu Leu Pro
235 240 245

TGC ACC GCC TCG GGC TAC CCT ATC CCC GCC ATC CGC TGG CTC AAG GAT 884
Cys Thr Ala Ser Gly Tyr Pro Ile Pro Ala Ile Arg Trp Leu Lys Asp
250 255 260

GGC CGG CCC CTC CCG GCT GAC AGC CGC TGG ACC AAG CGC ATC ACA GGG 932
Gly Arg Pro Leu Pro Ala Asp Ser Arg Trp Thr Lys Arg Ile Thr Gly
265 270 275

CTG ACC ATC AGC GAC TTG CGG ACC GAG GAC AGC GGC ACC TAC ATT TGT 980
Leu Thr Ile Ser Asp Leu Arg Thr Glu Asp Ser Gly Thr Tyr Ile Cys
280 285 290

GAG GTC ACC AAC ACC TTC GGT TCG GCA GAG GCC ACA GGC ATC CTC ATG 1028
Glu Val Thr Asn Thr Phe Gly Ser Ala Glu Ala Thr Gly Ile Leu Met
295 300 305 310

GTC ATT GAT CCC CTT CAT GTG ACC CTG ACA CCA AAG AAG CTG AAG ACC 1076
Val Ile Asp Pro Leu His Val Thr Leu Thr Pro Lys Lys Leu Lys Thr
315 320 325

GGC ATT GGC AGC ACG GTC ATC CTC TCC TGT GCC CTG ACG GGC TCC CCA 1124
Gly Ile Gly Ser Thr Val Ile Leu Ser Cys Ala Leu Thr Gly Ser Pro
330 335 340

GAG TTC ACC ATC CGC TGG TAT CGC AAC ACG GAG CTG GTG CTG CCT GAC 1172
Glu Phe Thr Ile Arg Trp Tyr Arg Asn Thr Glu Leu Val Leu Pro Asp
345 350 355

GAG GCC ATC TCC ATC CGT GGG CTC AGC AAC GAG ACG CTG CTC ATC ACC 1220
Glu Ala Ile Ser Ile Arg Gly Leu Ser Asn Glu Thr Leu Leu Ile Thr
360 365 370

TCG GCC CAG AAG AGC CAT TCC GGG GCC TAC CAG TGC TTC GCT ACC CGC 1268
Ser Ala Gln Lys Ser His Ser Gly Ala Tyr Gln Cys Phe Ala Thr Arg
375 380 385 390

AAG GCC CAG ACC GCC CAG GAC TTT GCC ATC ATT GCA CTT GAG GAT GGC 1316
Lys Ala Gln Thr Ala Gln Asp Phe Ala Ile Ile Ala Leu Glu Asp Gly
395 400 405

ACG CCC CGC ATC GTC TCG TCC TTC AGC GAG AAG GTG GTC AAC CCC GGG 1364
Thr Pro Arg Ile Val Ser Ser Phe Ser Glu Lys Val Val Asn Pro Gly
410 415 420

GAG CAG TTC TCA CTG ATG TGT GCG GCC AAG GGC GCC CCG CCC CCC ACG 1412
Glu Gln Phe Ser Leu Met Cys Ala Ala Lys Gly Ala Pro Pro Pro Thr
425 430 435

GTC ACC TGG GCC CTC GAC GAT GAG CCC ATC GTG CGG GAT GGC AGC CAC 1460
Val Thr Trp Ala Leu Asp Asp Glu Pro Ile Val Arg Asp Gly Ser His
440 445 450

CGC ACC AAC CAG TAC ACC ATG TCG GAC GGC ACC 1493
Arg Thr Asn Gln Tyr Thr Met Ser Asp Gly Thr
455 460 465 
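
The coding-sequence feature of SEQ ID NO: 6 (LOCATION: 99..1493) can be checked against the 465-residue translation with a short Python sketch. The sketch assumes the standard genetic code and 1-based, inclusive coordinates. The helper names cds_codons and translate are hypothetical, and only the first six codons of the listing are translated here.

```python
# Standard genetic code, built from the conventional TCAG ordering (assumed).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def cds_codons(start, end):
    """Number of codons in a feature with 1-based, inclusive coordinates."""
    length = end - start + 1
    if length % 3:
        raise ValueError("feature length is not a multiple of three")
    return length // 3


def translate(dna):
    """Translate a DNA string codon by codon into one-letter amino acids."""
    dna = dna.replace(" ", "").upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


print(cds_codons(99, 1493))                    # codons spanned by 99..1493
print(translate("ATG TGG CTG GTA ACT TTC"))    # first six codons of the listing
```

The feature spans 1395 bases, which is 465 codons with no terminating codon inside the listed range, consistent with the clone being described as a partial cDNA.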

(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 462 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 



Met Trp Leu Val Thr Phe Leu Leu Leu Leu Asp Ser Leu His Lys Ala
1 5 10 15
Arg Pro Glu Asp Val Gly Thr Ser Leu Tyr Phe Val Asn Asp Ser Leu
20 25 30
Gln Gln Val Thr Phe Ser Ser Ser Val Gly Val Val Val Pro Cys Pro
35 40 45
Ala Ala Gly Ser Pro Ser Ala Ala Leu Arg Trp Tyr Leu Ala Thr Gly
50 55 60
Asp Asp Ile Tyr Asp Val Pro His Ile Arg His Val His Ala Asn Gly
65 70 75 80
Thr Leu Gln Leu Tyr Pro Phe Ser Pro Ser Ala Phe Asn Ser Phe Ile
85 90 95
His Asp Asn Asp Tyr Phe Cys Thr Ala Glu Asn Ala Ala Gly Lys Ile
100 105 110
Arg Ser Pro Asn Ile Arg Val Lys Ala Val Phe Arg Glu Pro Tyr Thr
115 120 125
Val Arg Val Glu Asp Gln Arg Ser Met Arg Gly Asn Val Ala Val Phe
130 135 140
Lys Cys Leu Ile Pro Ser Ser Val Gln Glu Tyr Val Ser Val Val Ser
145 150 155 160
Trp Glu Lys Asp Thr Val Ser Ile Ile Pro Glu Asn Arg Phe Phe Ile
165 170 175
Thr Tyr His Gly Gly Leu [...]
180
Leu Ser Thr Tyr Arg Cys [...]
195
Arg Gln Ser Asn Gly Ala [...]
210
Ile Pro Thr Ile Leu Asp [...]
225 230
His Thr Val Glu Leu Pro [...]
245
Ile Arg Trp Leu Lys Asp [...]
260
Thr Lys Arg Ile Thr Gly [...]
275
Ser Gly Thr Tyr Ile Cys [...]
290
Ala Thr Gly Ile Leu Met [...]
305 310
Pro Lys Lys Leu Lys Thr [...]
325
Ala Leu Thr Gly Ser Pro [...]
340
Glu Leu Val Leu Pro Asp [...]
355
Glu Thr Leu Leu Ile Thr [...]
370
Gln Cys Phe Ala Thr Arg [...]
385 390
Ile Ala Leu Glu Asp Gly [...]
405
Lys Val Val Asn Pro Gly [...]
420
Gly Ala Pro [...]
(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 605 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 615 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 611 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:
Met


He 


Asp 


Arg 


Lys 


Pro 


225 










230 










235 










240 


Arg 


Leu 


Leu 


Phe 


Pro 


Thr 


Asn 


Ser 


Ser 


Ser 


His 


Leu 


Val 


Ala 


Leu 


Gin 








245 










250 










255 




Gly Gin 


Pro 


Leu 


Val 


Leu 


Glu 


Cys 


He 


Ala 


Glu 


Gly 


Phe 


Pro 


Thr 


Pro 








260 








265 








270 






Thr 


Ile


Lys 
275 


Trp 


Leu 


Arg 


Pro 


Ser 
280 


Gly 


Pro 


Met 


Pro 


Ala 
285 


Asp 


Arg 


Val 


Thr 


Tyr 


Gln


Asn 


His 


Asn 


Lys 


Thr 


Leu 


Gln


Leu 


Leu 


Lys 


Val 


Gly 


Glu 




290 










295 










300 








Glu 


Asp 


Asp 


Gly 


Glu 


Tyr 


Arg 


Cys 


Leu 


Ala 


Glu 


Asn 


Ser 


Leu 


Gly 


Ser 


305 










310 










315 










320 


Ala 


Arg 


His 


Ala 


Tyr 


Tyr 


Val 


Thr 


Val 


Glu 


Ala 


Ala 


Lys 


Tyr 


Arg 


Ile








325 










330 










335 




Gln


Arg 


Gly 


Ala 


Leu 


Ile


Leu 


Ser 


Asn 


Val 


Gln


Pro 


Ser 


Asp 


Thr 


Met 



340 345 350
Val


Thr 


Gln
355 


Cys 


Glu 


Ala 


Arg 


Asn 
360 


Arg 


Ala 


Tyr 
370 


Ile


Tyr 


Val 


Val 


Gln
375 


Leu 


Pro 


Asn 


Gln


Thr 


Tyr 


Met 


Ala 


Val 


Pro 


Tyr 


385 










390 








His 


Leu 


Tyr 


Gly 


Pro 
405 


Gly 


Glu 


Thr 


Ala 


Gly 


Arg 


Pro 


Gln
420 


Pro 


Glu 


Val 


Thr 


Trp 
425 


Glu 


Glu 


Leu 
435 


Ala 


Lys 


Asp 


Gln


Gln
440 


Gly 


Lys 


Ala 


Phe 


Gly 


Ala 


Pro 


Val 


Pro 


Ser 


450 










455 






Gly 


Thr 


Thr 


Val 


Leu 


Gln


Asp 


Glu 


Arg 


465 










470 








Thr 


Leu 


Gly 


Ile


Arg 


Asp 


Leu 


Gln


Ala 








485 










Cys 


Leu 


Ala 


Ala 


Asn 


Asp 


Gln


Asn 


Asn 






500 










505 


Lys 


Val 


Lys 


Asp 


Ala 


Thr 


Gln


Ile


Thr 




515 










520 




Glu 


Lys 
530 


Lys 


Gly 


Ser 


Arg 


Val 
535 


Thr 


T - * V» M 

Phe 


Pro 


Ser 


Leu 


Gln


Pro 


Ser 


Ile


Thr 


Trp 


545 










550 








Gln


Glu 


Leu 


Gly 


Asp 
565 


Ser 


Asp 


Lys 


Tyr 


Val 


Ile


His 


Ser 
580 


Leu 


Asp 


Tyr 


Ser 


Asp 
585 


Ala 


Ser 


Thr 


Glu 


Leu 


Asp 


Val 


Val 


Glu 



595 600 



Val Gly Ser 
610
His


Gly 


Leu 


Leu 
365 


Leu 


Ala 


Asn 


Ala 


Lys 


Ile
380 


Leu 


Thr 


Ala 


Asp 


Trp 


Leu 


His 


Lys 


Pro 


Gln


Ser 


395 










400 


Arg 


Leu 


Asp 


Cys 


Gln


Val 


Gln


410 










415 




Arg 


Ile


Asn 


Gly 


Ile


Pro 


Val 








430 






Ser 


Thr 


Ala 


Tyr 


Leu 


Leu 


Cys 








445 






Val 


Gln


Trp 
460 


Leu 


Asp 


Glu 


Asp 


Phe 


Phe 
475 


Pro 


Tyr 


Ala 


Asn 


Gly 
480 


Asn 


Asp 


Thr 


Gly 


Arg 


Tyr 


Phe 


490 










495 




Val 


Thr 


Ile


Met 


Ala 
510 


Asn 


Leu 


Gln


Gly 


Pro 


Arg 


Ser 


Thr 


Ile






525 








Thr 


Cys 


Gln
540 


Ala 


Ser 


Phe 


Asp 


Arg 


Gly 
555 


Asp 


Gly 


Arg 


Asp 


Leu 
560 


Phe 


Ile


Glu 


Asp 


Gly 


Arg 


Leu 


570 










575 




Gln


Gly 


Asn 


Tyr 


Ser 
590 


Cys 


Val 


Ser 


Arg 


Ala 


Gln


Leu 


Leu 


Val 



605 



(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 612 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 



Met 


Met 


Lys 


Glu 


Lys 


Ser 


Ile


Ser 


Ala 


Ser 


Lys 


Ala 


Ser 


Leu 


Val 


Phe 


1 






5 










10 










15 




Phe 


Leu 


Cys 


Gln


Met 


Ile


Ser 


Ala 


Leu 


Asp 


Val 


Pro 


Leu 


Asp 


Ser 


Lys 






20 










25 










30 






Leu 


Leu 


Glu 
35 


Glu 


Leu 


Ser 


Gln


Pro 
40 


Pro 


Thr 


Ile


Thr 


Gln
45 


Gln


Ser 


Pro 


Lys 


Asp 


Tyr 


Ile


Val 


Asp 


Pro 


Arg 


Glu 


Asn 


Ile


Val 


Ile


Gln


Cys 


Glu 


50 








55 










60 










Ala 


Lys 


Gly 


Lys 


Pro 


Pro 


Pro 


Ser 


Phe 


Ser 


Trp 


Thr 


Arg 


Asn 


Gly 


Thr 


65 






70 










75 










80 


His 


Phe 


Asp 


Ile


Asp 


Lys 


Asp 


Ala 


Gln


Val 


Thr 


Met 


Lys 


Pro 


Asn 


Ser 








85 








90 










95 




Gly 


Thr 


Leu 


Val 


Val 


Asn 


Ile


Met 


Asn 


Gly 


Val 


Lys 


Ala 


Glu 


Ala 


Tyr 






100 










105 










110 






Glu 


Gly 


Val 


Tyr 


Gln


Cys 


Thr 


Ala 


Arg 


Asn 


Glu 


Arg 


Gly 


Ala 


Ala 


Ile




115 






120 










125 








Ser 


Asn 
130 


Asn 


Ile


Val 


Ile


Arg 
135 


Pro 


Ser 


Arg 


Ser 


Pro 
140 


Leu 


Trp 


Thr 


Lys 


Glu 


Lys 


Leu 


Glu 


Pro 


Asn 


His 


Val 


Arg 


Glu 


Gly 


Asp 


Ser 


Leu 


Val 


Leu 


145 








150 








155 










160 


Asn 


Cys 


Arg 


Pro 


Pro 


Val 


Gly 


Leu 


Pro 


Pro 


Pro 


Ile


Ile


Phe 


Trp 


Met 



165 170 175 
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Asp 


Asn 


Ala 


Pne 


Gin 


Arg 






loU 






Leu 


Asn 


Gly 


Asp 


Leu 


Tyr 






1 Q C 

iy b 








vai 


Asp 


Tyr 


lie 


Cys 


Tyr 




Z. 1U 










Gin 


Ly s 


Gin 


Pro 


Tip 




b 








230 


Glu 


Arg 


Pro 


Pro 


Val 


Leu 








OAK 




Val 


Glu 


T nil 

Leu 


Arg 




Asn 














Leu 


Pro 


Thr 


Pro 


vai 


X 1c 






*5 *7 








Ala 


Asn 


Arg 


x nr 


rue 
















Asp 


val 


Ser 


VjIU 


nla 


ASp 












jIU 


Thr 


Leu 


Giy 


Ser 


i nr 


nis 












Pro 


Tyr 


Trp 


lie 


inr 


Ala 














Asp 


Gly 


Thr 


Leu 


Tin 

lie 


Cys 






"3 C C 

J bo 








Ser 


Trp 


Leu 


Thr 


Asn 


Giy 




*a *7 o 










Ser 


Arg 


Lys 


vai 


Asp 


Giy 


oat 

Jab 










ion 


Arg 


Ser 


Ser 


Ala 


Val 


Tyr 








4Ub 




Leu 


Leu 


Ala 


Asn 


Ala 


Pne 














Leu 


Thr 


Pro 


Ala 


Asn 


Lys 






435 








Leu 


lie 


Asp 


Cys 


Ala 


Tyr 














Pne 


Arg 


Gly 


vai 


Lys 


Giy 












at n 


His 


Asp 


Asn 


Gly 


inr 


Leu 










4ob 




Gly 


Thr 


Tyr 


Thr 


Cys 


vai 








bUU 






Glu 


vai 


Gin 


Leu 


VjIU 


Val 






bib 








bin 


Tyr 


Lys 


vai 


lie 


bin 




oJL) 










lie 


Lys 


His 


Asp 


Pro 


Thr 


545 










550 


Asn 


Asn 


Glu 


Leu 


Pro 


Asp 










565 


Leu 


Thr 


lie 


Met 


Asn 


Val 








580 






lie 


Val 


Asn 


Thr 


Thr 


Leu 






595 








Val 


Val 


Ala 


Ala 







610 
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Leu 


Pro 


Gin 


Ser 


VjIU 


Arg 






lob 








"D K /-^ 
rue 


Ser 


Asn 


val 


bin 


Pro 




200 










Ala 


Arg 


Arne 


Asn 


His 


Thr 


z lb 








220 


Val 


Lys 


V al 


ir nt= 


Ser 


Thr 








235 




Leu 


i nr 


Pro 


PlcL 


Gly 


Ser 












V ell 


Leu 


Leu 


Leu 


Glu 


Cys 






4& □ 3 








Arg 


Trp 


Tip 

1 1c 


Lys 


Glu Gly 














blU 


Asn 


xr lie 


Lys 


Lys 


Thr 










300 


Ser 


biy 


Asn 


Tyr 


Lys 


Cys 










315 




rllS 


vai 


lie 


Ser 


Val 


Thr 








Tin 






Pro 


Arg 


Asn 


Leu 


val 


Leu 












Arg 


Ala 


Asn 


uiy 


Asn 


Pro 




o ou 










vai 


Pro 


Tin 

i ie 


TV 1 n 

Ala 


lie 


Ala 


J /b 










380 


Asp 


Tnr 


Tin 

l ie 


lie 


Phe 


Ser 








395 




Gin 


Cys 


Asn 


Ala 


Ser 


Asn 






410 






val 


Asn 


vai 


Leu 


Ala 
ma a 


Glu 






A*5 C 








Leu 


Tyr 


Gin 


vai 


lie 


Ala 




a An 










T3 V> ^ 

rile 


Giy 


Ser 


Pro 


Lys 


Pro 


4bb 










460 


Ser 


lie 


Leu 


Arg 


Gly 


Asn 










475 




blu 


i ie 


Pro 


Val 


Ala 


Gin 








Aon 






Ala 


Arg . 


Asn 


Lys 


Leu 


Gly 




505 








Lys 


ASp 


JTX. \J 


Thy 
i in 


Met 


lie 




520 










Arg 


Ser 


Ala 


Gin 


Ala 


Ser 


535 










540 


Leu 


lie 


Pro 


Thr 


Val 


lie 










555 




Asp 


Glu 


Arg 


Phe 


Leu 


Val 








570 






Thr 


Asp 


Lys 


Asp 


Asp 


Gly 






585 








Asp 


Ser 


Val 


Ser 


Ala 


Ser 



600 
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Val 


Ser 


Gin 


Gly 




190 






Glu 


A QT* 


X XIX 


AX y 


205 








Gin 


Thr 

-L i IX 


X XC5 


Gin 


Lys 


Pro 


Val 


Thr 








240 


Thr 


Ser 


Asn 


Lys 






255 




lie 


Ala 


Ala 


Glv 




270 






Glv 
vriy 


Glu 


Leu 


Pro 


285 








Leu 


Lys 


lie 


lie 


Thr 


Ala 


A ttt 


nan 








320 


Val 


Lys 


Ala 


Ala 










Ser 


Pro 


Gly 


Glu 




350 






Lys 


Pro 


Ser 


lie 


365 








Pro 


Glu 


Asp 


Pro 


Al a 


v ax 


m n 


v?x u 








400 


Glu 


Tyr 


Gly 


Tyr 






AIR 




XT X. U 


X? X \J 


nx^ 


Tip 

X xts 




430 






A cn 


C a y« 

OCX 




Al a 
Axa 


445 








vv XU 


Tip 
lie 


IvXU 


Trp 


Glu 


lyr 


Va 1 
v ax 


Php 
rile 








480 


Lys 


Asp 


Ser 


Thr 






495 




Lys 


Thr 


Gin 


Asn 


510 






lie 


Lys 


Gin 


Pro 


525 








Phe 


Glu 


Cvs 


Val 


Trp 


Leu 


Lys 


Asp 








560 


Gly 


Lys 


Asp 


Asn 






575 




Thr 


Tyr 


Thr 


Cys 




590 






Ala 


Val 


Leu 


Thr 



605 



(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 607 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

Met Gly Thr Ala Thr Arg Arg Lys Pro His Leu Leu Leu Val Ala Ala 

1 5 10 15

Val Ala Leu Val Ser Ser Ser Ala Trp Ser Ser Ala Leu Gly Ser Gln

20 25 30 

Thr Thr Phe Gly Pro Val Phe Glu Asp Gln Pro Leu Ser Val Leu Phe

35 40 45 

Pro Glu Glu Ser Thr Glu Glu Gln Val Leu Leu Ala Cys Arg Ala Arg

50 55 60 

Ala Ser Pro Pro Ala Thr Tyr Arg Trp Lys Met Asn Gly Thr Glu Met 
65 70 75 80 

Lys Leu Glu Pro Gly Ser Arg His Gln Leu Val Gly Gly Asn Leu Val

85 90 95 

Ile Met Asn Pro Thr Lys Ala Gln Asp Ala Gly Val Tyr Gln Cys Leu

100 105 110 

Ala Ser Asn Pro Val Gly Thr Val Val Ser Arg Glu Ala Ile Leu Arg

115 120 125 

Phe Gly Phe Leu Gln Glu Phe Ser Lys Glu Glu Arg Asp Pro Val Lys

130 135 140 

Ala His Glu Gly Trp Gly Val Met Leu Pro Cys Asn Pro Pro Ala His 
145 150 155 160 

Tyr Pro Gly Leu Ser Tyr Arg Trp Leu Leu Asn Glu Phe Pro Asn Phe

165 170 175

Ile Pro Thr Asp Gly Arg His Phe Val Ser Gln Thr Thr Gly Asn Leu

180 185 190

Tyr Ile Ala Arg Thr Asn Ala Ser Asp Leu Gly Asn Tyr Ser Cys Leu

195 200 205 

Ala Thr Ser His Met Asp Phe Ser Thr Lys Ser Val Phe Ser Lys Phe 

210 215 220 

Ala Gln Leu Asn Leu Ala Ala Glu Asp Thr Arg Leu Phe Ala Pro Ser
225 230 235 240 

Ile Lys Ala Arg Phe Pro Ala Glu Thr Tyr Ala Leu Val Gly Gln Gln

245 250 255 

Val Thr Leu Glu Cys Phe Ala Phe Gly Asn Pro Val Pro Arg Ile Lys

260 265 270 

Trp Arg Lys Val Asp Gly Ser Leu Ser Pro Gln Trp Thr Thr Ala Glu

275 280 285

Pro Thr Leu Gln Ile Pro Ser Val Ser Phe Glu Asp Glu Gly Thr Tyr

290 295 300 

Glu Cys Glu Ala Glu Asn Ser Lys Gly Arg Asp Thr Val Gln Gly Arg
305 310 315 320 

Ile Ile Val Gln Ala Gln Pro Glu Trp Leu Lys Val Ile Ser Asp Thr

325 330 335 

Glu Ala Asp Ile Gly Ser Asn Leu Arg Trp Gly Cys Ala Ala Ala Gly

340 345 350 

Lys Pro Arg Pro Thr Val Arg Trp Leu Arg Asn Gly Glu Pro Leu Ala 

355 360 365 

Ser Gln Asn Arg Val Glu Val Leu Ala Gly Asp Leu Arg Phe Ser Lys

370 375 380 

Leu Ser Leu Glu Asp Ser Gly Met Tyr Gln Cys Val Ala Glu Asn Lys
385 390 395 400 

His Gly Thr Ile Tyr Ala Ser Ala Glu Leu Ala Val Gln Ala Leu Ala

405 410 415 

Pro Asp Phe Arg Leu Asn Pro Val Arg Arg Leu Ile Pro Ala Ala Arg

420 425 430 

Gly Gly Glu Ile Leu Ile Pro Cys Gln Pro Arg Ala Ala Pro Lys Ala

435 440 445 

Val Val Leu Trp Ser Lys Gly Thr Glu Ile Leu Val Asn Ser Ser Arg

450 455 460 

Val Thr Val Thr Pro Asp Gly Thr Leu Ile Ile Arg Asn Ile Ser Arg
465 470 475 480 

Ser Asp Glu Gly Lys Tyr Thr Cys Phe Ala Glu Asn Phe Met Gly Lys 

485 490 495 

Ala Asn Ser Thr Gly Ile Leu Ser Val Arg Asp Ala Thr Lys Ile Thr

500 505 510 

Leu Ala Pro Ser Ser Ala Asp Ile Asn Leu Gly Asp Asn Leu Thr Leu
515 520 525 
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Pin 


Cys 




Al a 

AX CL 






71 C T~» 
£\ O 




X J IX 




A cn 


Leu 


Thr 

X I1X 




Thr 

X IlX. 


Trp 




530 










535 








540 








Thr 


Leu 


Asp 


Asp 


Phe 


Pro 
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Asp 


Phe 


Asp 


Lys 


Pro 


Gly 


Gly 


His 


Tyr 


545 










550 










555 










560 


Arg 


Arg 


Thr 


Asn 


Val 
565 


Lys 


Glu 


Thr 


Ile


Gly 
570 


Asp 


Leu 


Thr 


Ile


Leu 
575 


Asn 


Ala 


Gln


Leu 


Arg 
580 


His 


Gly 


Gly 


Lys 


Tyr 
585 


Thr 


Cys 


Met 


Ala 


Gln
590 


Thr 


Val 


Val 


Asp 


Ser 
595 


Ala 


Ser 


Lys 


Glu 


Ala 
600 


Thr 


Val 


Leu 


Val 


Arg 
605 


Gly 


Pro 





(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 596 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 





Leu 


OCX 


xrp 


xiy S3 


Gin 


Leu 


lie 


Leu 


Leu 


Ser 


php 

XT 11C 


Tip 


Gly 




Leu 


J. 








c 

3 










XU 










-L 3 




Ala 


Var Xy 


f 1 11 
uXU 


Leu 


Leu 


L6U 


Gin 


uiy 


Pro 


v ax 


rile 


v ax 


Lys 


\y XU 


Pro 


Ser 




























Asn 


Ser 


lie 


Phe 


Pro 


Val 


Glv 
v»xy 


Ser 


Glu 


Asp 


Lys 


Lys 


He 


Thr 


Leu 


Asn 






3 3 














■» -j 








Cys 


Glu 


Ala 


Arg 


Glv 


Asn 


Pro 


Ser 


Pro 


His 


Tvr 


Arg 


XX±J 


Gin 


Leu 


Asn 














55 










60 










Glv 
v»xy 


Ser 


Asp 


lie 


Asp 


Thr 


Ser 


Leu 


Asp 


His 


Arg 


Tvr 
xy x 


Lys 


Leu 


Asn 


Gly 


65 










70 










75 










80 


Glv 


Asn 


Leu 


lie 


Val 


lie 


Asn 


Pro 


Asn 


Arc 


Asn 


Tro 

Xt 


A st> 


Thr 


Glv 


Ser 










85 










90 










95 




Tyr 


Gin 


Cys 


Phe 


Ala 


Thr 


Asn 


Ser 


Leu 


Gly 


Thr 


He 


Val 


Ser 


Arg 


Glu 








100 










105 










110 






Ala 


Lys 


Leu 


Gin 


Phe 


Ala 


Tyr 


Leu 


Glu 


Asn 


Phe 


Lys 


Ser 


Arg 


Met 


Arg 






115 










120 










125 






Ser 


Arg 


Val 


Ser 


Val 


Arg 


Glu 


Gly 


Gin 


Gly 


Val 


Val 


Leu 


Leu 


Cys 


Gly 




130 










135 










140 










Pro 


Pro 


Pro 


His 


Ser 


Gly 


Glu 


Leu 


Ser 


Tyr 


Ala 


Trp 


Val 


Phe 


Asn 


Glu 


145 










150 










155 










160 


Tyr 


Pro 


Ser 


Phe 


Val 


Glu 


Glu 


Asp 


Ser 


Arg 


Arg 


Phe 


Val 


Ser 


Gin 


Glu 










165 










170 










175 




Thr 


Gly 


His 


Leu 


Tyr 


lie 


Ala 


Lys 


Val 


Glu 


Pro 


Ser 


Asp 


Val 


Gly 


Asn 








180 










185 
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Tyr 


Thr 


Cys 


Val 


Val 


Thr 


Ser 


Thr 


Val 


Thr 


Asn 


Ala 


Arg 


Val 


Leu 


Gly 






195 










200 










205 






Ser 


Pro 


Thr 


Pro 


Leu 


Val 


Leu 


Arg 


Ser 


Asp 


Gly 


Val 


Met 


Gly 


Glu 


Tyr 




210 










215 










220 










Glu 


Pro 


Lys 


lie 


Glu 


Leu 


Gin 


Phe 


Pro 


Glu 


Thr 


Leu 


Pro 


Ala 


Ala 


Lys 


225 










230 










235 










240 
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Ser 


Thr 


Val 


Lys 


Leu 


Glu 


Cys 


Phe 


Ala 


Leu 


Gly 


Asn 


Pro 


Val 


Pro 
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Met 


Pro 
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Pro 
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He 








260 










265 
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Arg 
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Tyr 


Arg 


Trp 
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Lys 








340 
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Val 
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Glu 


Glu 


Arg 


He 


Gin 


He 


Glu 


Asn 


Gly 



355 360 365 
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Leu 
370 


Thr 


He 


Ala 
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He 
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Glu 
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Lys 


385 
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Lys 


Val 
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Ala 
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Ala 








405 




Met 


He 
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Val 
420 
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Val 
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Ala 


Ser 
435 
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Ala 


Val 


Arg 
450 


Glu 
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Ala 


Arg 


He 


Met 
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Val 


Thr 


Lys 


465 










470 


Glu 
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Phe 


Gly 
485 


Lys 


Glu 


Pro 


Thr 


Arg 
500 


He 


He 


Gly 


Glu 


Ser 


He 


He 


Leu 




515 








Asp 


He 


Met 


Phe 


Ala 


Trp 


530 










Lys 


Asp 


Gly 


Ser 


His 


Phe 


545 










550 


Leu 


Met 


He 


Arg 


Asn 
565 


He 


Met 


Val 


Gin 


Thr 
580 


Gly 


Val 


Val 


Arg 


Gly 
595 


Ser 
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Leu 


Asn 


Val 


Ser 


Asp 


Ser 


375 










380 


His 


Gly 


Leu 


He 


Tyr 
395 


Ser 


Pro 


Asp 


Phe 


Ser 
410 


Arg 


Asn 


Gly 


Ser 


Leu 


Val 


He 


Leu 




425 








Leu 


Ser 
440 


Phe 


Trp 


Lys 


Lys 


He 


Ser 


Leu 


Leu 


Asn 


Asp 


455 










460 


Ala 


Asp 


Ala 


Gly 


He 
475 


Tyr 


Ala 


Asn 


Gly 


Thr 


Thr 


Gin 






490 






Leu 


Ala 


Pro 
505 


Ser 


Asn 


Met 


Pro 


Cys 
520 


Gin 


Val 


Gin 


His 


Tyr 


Phe 


Asn 


Gly 


Thr 


Leu 


535 










540 


Glu 


Lys 


Val 


Gly 


Gly 
555 


Ser 


Gin 


Leu 


Lys 


His 


Ser 


Gly 






570 






Asp 


Ser 


Val 


Ser 


Ser 


Ala 



585 
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Gly 


Met 


Phe 


Gin 


Ser 


Ala 


Glu 


Leu 








400 


Pro 


Met 


Lys 


Lys 






415 




Asp 


Cys 


Lys 


Pro 




430 






Gly 


Asp 


Thr 


Val 


445 








Gly 


Gly 


Leu 


Lys 


Thr 


Cvs 


lie 


Ala 






480 


Leu 


Val 


Val 


Thr 






495 




Asp 


Val 


Ala 


Val 




510 






Asp 


Pro 


Leu 


Leu 


525 








Thr 


Asp 


Phe 


Lys 


Ser 


Ser 


Gly 


Asp 








560 


Lys 


Tyr 


Val 


Cys 






575 




Ala 


Glu 


Leu 


He 



590 



(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 630 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 



Met 


Val


Leu 


His 


Ser 


His 


Gln


Leu 


Thr 


Tyr 


Ala 


Gly 


Ile


Ala 


Phe 


Ala 


1 








5 










10 










15 




Leu 


Cys 


Leu 


His 


His 


Leu 


Ile


Ser 


Ala 


Ile


Glu 


Val 


Pro 


Leu 


Asp 


Ser 






20 










25 










30 






Asn 


Ile


Gln
35 


Ser 


Glu 


Leu 


Pro 


Gln
40 


Pro 


Pro 


Thr 


Ile


Thr 
45 


Lys 


Gln


Ser 


Val 


Lys 


Asp 


Tyr 


Ile


Val 


Asp 


Pro 


Arg 


Asp 


Asn 


Ile


Phe 


Ile


Glu 


Cys 




50 






55 










60 










Glu 


Ala 


Lys 


Gly 


Asn 


Pro 


Val 


Pro 


Thr 


Phe 


Ser 


Trp 


Thr 


Arg 


Asn 


Gly 


65 






70 










75 










80 


Lys 


Phe 


Phe 


Asn 


Val 


Ala 


Lys 


Asp 


Pro 


Lys 


Val 


Ser 


Met 


Arg 


Arg 


Arg 








85 










90 










95 




Ser 


Gly 


Thr 


Leu 


Val 


Ile


Asp 


Phe 


His 


Gly 


Gly 


Gly 


Arg 


Pro 


Asp 


Asp 






100 










105 










110 






Tyr 


Glu 


Gly 


Glu 


Tyr 


Gln


Cys 


Phe 


Ala 


Arg 


Asn 


Asp 


Tyr 


Gly 


Thr 


Ala 




115 








120 










125 








Leu 


Ser 


Ser 


Lys 


Ile


His 


Leu 


Gln


Val 


Ser 


Arg 


Ser 


Pro 


Leu 


Trp 


Pro 




130 








135 










140 










Lys 


Glu 


Lys 


Val 


Asp 


Val 


Ile


Glu 


Val 


Asp 


Glu 


Gly 


Ala 


Pro 


Leu 


Ser 


145 






150 










155 










160 


Leu 


Gln


Cys 


Asn 


Pro 


Pro 


Pro 


Gly 


Leu 


Pro 


Pro 


Pro 


Val 


Ile


Phe 


Trp 








165 










170 










175 




Met 


Ser 


Ser 


Ser 
180 


Met 


Glu 


Pro 


Ile


His 
185 


Gln


Asp 


Lys 


Arg 


Val 
190 


Ser 


Gln


Gly 


Gln


Asn 


Gly 


Asp 


Leu 


Tyr 


Phe 


Ser 


Asn 


Val 


Met 


Leu 


Gln


Asp 


Ala 




195 








200 










205 
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Gin 


Thr 


Asp 


Tyr 


Ser 


Cys 




21U 










Gin 


Gin 


Lys 


Asn 


Pro 


Tyr 


1 1 c 












Asn 


Glu 


Thr 


Ser 


Leu 


Arg 










O A C 




Val 


Thr 


Glu 


Thr 


|T» V-» v> 


Pro 














Ser 


Gin 


Met 


IT- 1 

Val 


Leu 


Arg 






275 








Ser 


Gly 


val 


Pro 


Aia 


Pro 














Leu 


Pro 


Ala 


Gly 


Lys 


Thr 


305 










310 


lie 


Ser 


Asn 


Val 


Ser 


Glu 










QIC 




Ser 


Asn 


Lys 


Met 


Gly 


Ser 






340 






Ala 


Ala 


Pro 


Tyr 


Trp 


Leu 






? c c 








Gly 


Glu 


Asp 


Gly 


Arg 


Leu 




370 










Ser 


T 1 « 

lie 


Gin 


Trp 


Leu 


Veil 


*3 o tr 
Job 












Asn 


Pro 


Ser 


Arg 


Glu 


it- 1 

vai 










4Ub 




Gin 


lie 


Gly 


Ser 


Ser 


Ala 












Gly 


Tyr 


Leu 


Leu 


Ala 


Asn 






A *3 C 








Arg 


lie 


Leu 


Ala 


Pro 


Arg 




450 










Arg 


Thr 


Arg 


Leu 


Asp 


Cys 


465 










47U 


Arg 


Trp 


Phe 


Lys 


Asn 


Gly 










4ob 




Lys 


Ala 


His 


Glu 


Asn 


Gly 






bUU 






Asp 


Gin 


Gly 


He 


Tyr 


Thr 






CI c 








Glu 


Ala 


Gin 


Val 


Arg 


Leu 




530 










Gly 


Pro 


Glu 


Asp 


Gin 


vai 


545 










bbO 


Cys 


Arg 


IT- i 

val 


Lys 


HIS 


Asp 










565 




Lys 


Asp 


Asp 


Ala 


Pro 


Leu 








580 






Asp 


Gly 


Leu 


Thr 


He 


Tyr 






595 








Thr 


Cys 


Val 


Ala 


Ser 


Thr 




610 










Leu 


Thr 


Val 


Leu 


Ala 


He 


625 










630 
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Asn 


Ala 


Arg 


O K 0 




Jr I Its 


91 c 

a X 3 










220 


rnU — 

j. nr 


Leu 


Ly s 


vax 


Lys 


-L 1 IX 










235 




Asn 


U4 a 
nXa 


x nr 


Asp 


net 


Tyr 








PRO 






c 0 r 
Ocl 




ncu 


xyr 




iyr 
















val 


Asp 


Leu 


Leu 


Leu 














Asp 


T 1 0 
X lc 


wet 


Trp 


xyr 


Lys 














Lys 


Leu 


G1U 


Asn 


rne 


Asn 








11 c 
J13 




U1U 


Asp 


Ser 


Giy 


blu 


Tyr 








Jju 






xie 


Arg 


HIS 


1 nr 


Tl Q 

x xe 


Ser 




.3 4 3 








Asp 


bill 


Pro 


Gxn 


Asn 


Leu 












va jl 


Cys 


Arg 


ax a 


Asn 


uiy 












*?fin 


Asn 


uiy 


GXU 


Pro 


x xe 


ulli 








qqc 

•J 3 3 




Axa 


biy 


Asp 


x nr 


Tl Q 

x xe 


Vq1 








** JL\J 






vai 


Tyr 


Gin 


Cys 


Asn 


AXcL 
















irne 


vax 




vax 


Leu 




44U 










Asn 


Gin 


Leu 


Tl « 

xxe 


Lys 


vax 


4bb 










A£n 


Pro 


Pne 


pne 


Giy 


Ser 


Pro 








ATC 
*k to 




Gin 


Gly 


Asn 


Wet 


Leu 


Asp 








>i on 

4yu 






Ser 


Leu 


GXU 


ncu 


Ser 








cot; 








Cys 


v a jl 


ax a 


1 nr 


Asn 


Tin 

x xe 


con 










(jIU 


vax 


Lys 


Asp 


Pro 


1. X1XT 












RAH 


V clX 


Lys 


Arg 


uiy 


ser 


Uaf 

net 










555 




Pro 


Thr 


Leu 


Lys 


Leu 


Thr 








570 






Tyr 


He 


Gly 


Asn 


Arg 


Met 






585 








Gly 


Val 


Ala 


Glu 


Lys 


Asp 




600 










Glu 


Leu 


Asp 


Lys 


Asp 


Ser 


615 










620 
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inr 


Hi 0 
flX 3 


x nr 


x xe 


Lys 


Lys 


Pro 


His 








240 


Ser 


Ala 


Arg 


vf xy 






ncc 






Thr 
x 1. ix. 


Ser 


Ser 




270 








Cys 


x xe 


ax a 










Lys 


uiy 


wiy 


GXU 


Lys 


Ala 


Leu 


Arg 








320 


irne 


Cys 


Leu 


Aia 










V a. -L 


Arg 


vax 


Lys 










x xe 


Leu 


Axa 


Pro 


JD3 








Asn 


Pro 


Lys 


Pro 


Gly 


Ser 


Pro 


Pro 








Ann 


rne 


Arg 


Asp 


inr 






AIR 




Ser 


Asn 


pi,. 


WIS 










Asp 


vai 


Pro 


Pro 










Tl « 

xxe 


bin 


Tyr 


Asn 


lie 


Pro 


Thr 


Leu 








4BO 


pit* 
uiy 


Giy 


Asn 


Tyr 






** ;7 D 




Til a 
Ala 


Arg 


Lys 


GXU 




cin 

9 X w 






Leu 


wiy 


Lys 


vax 


R9 R 








Arg 


xxe 


IT- 1 

vax 


Arg 


Pro 


Arg 


Leu 


His 








560 


Val 


Thr 


Trp 


Leu 






575 




Lys 


Lys 


Glu 


Asp 




590 






Gin 


Gly 


Asp 


Xyr 


605 








Ala 


Lys 


Ala 


Xyr 



WO 98/22491 



PCT/US97/20201



What is claimed is: 

1. A method for identifying a cDNA nucleic acid
encoding a mammalian protein having a signal sequence,
the method comprising:

a) providing a library of mammalian cDNA;

b) ligating said library of mammalian cDNA to DNA
encoding alkaline phosphatase lacking both a signal
sequence and a membrane anchor sequence to form ligated
DNA;

c) transforming bacterial cells with said ligated
DNA to create a bacterial cell clone library;

d) isolating DNA comprising said mammalian cDNA
from at least one clone in said bacterial cell clone
library;

e) separately transfecting DNA isolated from
clones in step (d) into mammalian cells which do not
express alkaline phosphatase to create a mammalian cell
clone library wherein each clone in said mammalian cell
clone library corresponds to a clone in said bacterial
cell clone library;

f) identifying a clone in said mammalian cell
clone library which expresses alkaline phosphatase;

g) identifying the clone in said bacterial cell
clone library corresponding to said clone in said
mammalian cell clone library identified in step (f); and

h) isolating and sequencing a portion of the
mammalian cDNA present in said bacterial cell library
clone identified in step (g) to identify a mammalian cDNA
encoding a mammalian protein having a signal sequence.

2. The method of claim 1 wherein said library of
mammalian cDNAs is ligated to ptrAP3.

3. The method of claim 1 wherein said mammalian
cells are COS7 cells.

4. The method of claim 1 wherein said bacterial
cells are E. coli.

5. The expression vector ptrAP3.

6. The expression vector of claim 5, comprising
the sequence of SEQ ID NO:1.

7. The protein of SEQ ID NO:5.

8. An isolated nucleic acid sequence encoding the
amino acid sequence of SEQ ID NO:5.

9. A vector comprising the nucleic acid sequence 
of claim 8. 

10. The vector of claim 9 wherein said vector is 
an expression vector. 



11. A genetically engineered host cell comprising
the nucleic acid sequence of claim 5.

ptrAP3

FIG. 1

aagcttggctgtggaatgtgtgtcagttagggtgtggaaagtccccaggctccccagcaggcagaagtatgc

aaagcatgcatctcaattagtcagcaaccaggtgtggaaagtccccaggctccccagcaggcagaagtatgc 

aaagcatgcatctcaattagtcagcaaccatagtcccgcccctaactccgcccatcccgcccctaactctgc

ccagttccgcccattctccgccccAtggctgactaattttttttatttatgcagaggccgaggccgcctcgg 

cctctgagctattccagaagtagtgaggaggcttttttggagck:ctaggctttt 

cgaggggctcgcatctctccttcacgcgcccgccgccctacctgaggccgccatccacgccggttgagtcgc 

gttctgccgcctcccgcctgtggtgcctcctgaactgcgtccgccgtctaggtaagtttaaagctcaggtcg 

agaccgggcctttctccggcgctcccttggagcctaccta^ 

CTGCTTGCTCAACTCTACGTCTrrc 
AGAAAGTTAACTGGTAAGTTTAGTCTTTTTGTC 

ATCTAAGAACTGCTCCTCAGTGAGTGTTGCCTTTACTTCTAGGCCTGTACGGAAGTC 

AAGCTGCGqAATTCOCXCCXCCQ'PXqTTTTTXCflgggQflTgA.QCqCTCCACCCgglggTlgA 
AQCQCqTOTATqATqAQOTqTACqQCOACOAgflAggTQgMOAOCAflQCCAACOAflCflgg'P 
COqQQAqTTTqCCTACOOAAAqCqqCATAAqqACATQTTqqCflTTqCCQCTQaACQAflflflg 

AACCCAACACCTAqgCTAXxgcccQTgxgxgTqgxqcxqqTqcTqcccxcqgTTqgxggflT 
CCqAAgAAAAgCgCggCCTAAA^CgCaAgTCTgqTQACTTggCACCCACCqTqCAgCTgAT 

QGTACCgAAQgQggAQCQAgTQQAAQATQTgTT*QQAAAAAATQACCQTQQAQCCTQQQgTQ 

■QAgceeoAOQ-pggagaTeaggcaggAATeAAOgAooTaocAggoooAC'PQQOcorrocAOACgo 

TOqACQTTgAOATACggAggAggAOTAagAgTAOTATTOggAgTOggACAOAaOOgATQaA 

aXCXCKA.A.CavCCCCaaTTaCCTXaCTCaxaATCATCCCAGTTGAGGAGGAGAACCCGGACTTCm 
GAACCGCGAGGCJAGCCJGAGGCCCTGGGTGCCGCCAAGAAGCTGCAGCCTG^ACAGACAGCCGfTCAAGAACC'r 
CA TCA TCTTCfTTrWidGA TGGGA TGGGGGTGTCTA CC1GTGA (7 A GCTGCCA GGA TCCTAAAA GGtVA fiA An A A 
GGACAAAC'IVGGGCCTGAGATACCCCTGGCCATGGACCGCTTCCCATATGT^XCTCTGTCCAAnArATAhA^ 
TGTAGACAAACATYZTGCHAGACAGTGGAGCCACAGCCACGGCCTACCTVSTGCGGGGTCAAGtSGCAACT'Fr'f'A 
CCA "FTGGCTTGA GTGC.A GCCGCCCGCTTTAA CCA GTGCA A CA CGA CA CGCGGCAA CGA GfZTCA TCTCCGT 
GATGAATCGeWCAAGAAAGCAGGGAAGTCAGTGGGAGTCMAACCACCACACGAtt 

AGCCGGCAecrArnrrnArArrsvnsAArrr^AArrMVAnrrmAeeccGA 

GGA GGGGTGCCA GGA CA TCGCTA CGCA GCTCA TCTCCA AC A 1YJGA CA TTGACfiTGA TCCPA tVSmnA fWCn 



FIG. 2
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J\flAG TA CA TV? TTTCGCA TGGGA A CCCCA G A CCCTG A G TA CCCA GA r T y G A C TA CA GCCAAGGT^GGGA CCA GGCT 
GGA CGGGAAGA A TCTGGTGCAGGAA TGG CTGnCGAAGCGCCAGGGTGCCCGGTA TGTGTGGAACCGCACTGA 

arTrATGCAttcrrcccTttArccaTC TnraAcccArr^ 

CGA GA TCCACCGAGACTCCA CA CTGGAC CCCTCCCTGA TGGAGATGACAGAGGCTGCCCTGCGCCTGCTGAG 
CA GGAA CCCCCGCGGCTTCTTCCTCTTCGTGGA GG GTGGTCGCA TCGACCATGGTCA TCA TGA AA GCAGGGG 
TTA CCGGGCA CTGA CTGA GACGA TCA TGTTCGA CGA CGCCA TTGAGAGGGCGGGCCA GCTCA CCA GCGAGGA 
GGA CA CGCTGAGCCTCGTCACTGCCGA CCA CTCCCACGTCTTCTCCTTCGGAGGCTACCCCCTGCGAGGGAG 
CTCCA TCTTCGGGCTGGCCCCTGGCA A GGCCCGGGA CA GGAAGGCCTA CA CGGTCCTCCTA TA CGGAAACGG 
TCCAGGCTATGTGCTCAAGGACGGCGCCCGGCCGGATGTTACC GAGAGCGAGAGCGGGAGCCCCGAGTATCG 
GCAGCAGTCAGCAGTGCCCCTGGA CGA AG AG A CCCA CGCAGGCGAGGACGTGGCGGTGTTCGCGCGCGGCCC 
GCA GGCGCA CCTGGTTCA CGGCGTGCA GGAGCA GAC CTTCA TAGCGCA CGTCA TGGCCTTCGCCGCCTGCCT 
gGAGCCCTACACCGCCTGCGACCTGGCGCCCCCCGC CMCACCACCGACG^ 

TCTAGAGAAAAAACCTCCCACACCTCCCCCTGAACCTGAAACATAAAATGAATGCAATTGTTGTTGTTAACT 

TGTTTATTGCAGCTTATAATGGTTACAAATAAAGCAATAGCATCACAAATTTCACAAATAAAGCATTTTTTT 

CACTGCATTCTAGTTGTGGTTTGTCCAAACTCATCAATGTATCTTATCATGTCTGGATCCCCGGGTACCGAG 

CTCGAATTAATTCCTCTTCCGCTTCCTCGCTCACTGACTCGCTGCGCTCGGTCGTTCGGCTGCGGCGAGCGG 

TATCAGCTCACTCAAAGGCGGTAATACGGTTATCCACAGAATCAGGGGATAACGCAGGAAAGAACATGTGAG 

CAAAAGGCCAGCAAAAGGCCAGGAACCGTAAAAAGGCCGCGTTGCTGGCGTTTTTCCATAGGCTCCGCCCCC 

CTGACGAGCATCACAAAAATCGACGCTCAAGTCAGAGGTGGCGAAACCCGACAGGACTATAAAGATACCAGG 

CGTTTCCCCCTGGAAGCTCCCTCGTGCGCTCTCCTGTTCCGACCCTGCCGCTTACCGGATACCTGTCCGCCT 

TTCTCCCTTCGGGAAGCGTGGCGCTTTCTCAATGCTCACGCTGTAGGTATCTCAGTTCGGTGTAGGTCGTTC 

GCTCCAAGCTGGGCTGTGTGCACGAACCCCCCGTTCAGCCCGACCGCTGCGCCTTATCCGGTAACTATCGTC 

TTGAGTCCAACCCGGTAAGACACGACTTATCGCCACTGGCAGCAGCCACTGGTAACAGGATTAGCAGAGCGA 

GGTATGTAGGCGGTGCTACAGAGTTCTTGAAGTGGTGGCCTAACTACGGCTACACTAGAAGGACAGTATTTG 

GTATCTGCGCTCTGCTGAAGCCAGTTACCTTCGGAAAAAGAGTTGGTAGCTCTTGATCCGGCAAACAAACCA 

CCGCTGGTAGCGGTGGTTTTTTTGTTTGCAAGCAGCAGATTACGCGCAGAAAAAAAGGATCTCAAGAAGATC 

CTTTGATCTTTTCTACGGGGTCTGACGCTCAGTGGAACGAAAACTCACGTTAAGGGATTTTGGTCATGAGAT 

TATCAAAAAGGATCTTCACCTAGATCCTTTTAAATTAAAAATGAAGTTTTAAATCAATCTAAAGTATATATG 

AGTAAACTTGGTCTGACAGTTACCAATGCTTAATCAGTGAGGCACCTATCTCAGCGATCTGTCTATTTCGTT 

CATCCATAGTTGCCTGACTCCCCGTCGTGTAGATAACTACGATACGGGAGGGCTTACCATCTGGCCCCAGTG 

CTGCAATGATACCGCGAGACCCACGCTCACCGGCTCCAGATTTATCAGCAATAAACCAGCCAGCCGGAAGGG 

CCGAGCGCAGAAGTGGTCCTGCAACTTTATCCGCCTCCATCCAGTCTATTAATTGTTGCCGGGAAGCTAGAG 

TAAGTAGTTCGCCAGTTAATAGTTTGCGCAACGTTGTTGCCATTGCTACAGGCATCGTGGTGTCACGCTCGT 

CGTTTGGTATGGCTTCATTCAGCTCCGGTTCCCAACGATCAAGGCGAGTTACATGATCCCCCATGTTGTGCA 

AAAAAGCGGTTAGCTCCTTCGGTCCTCCGATCGTTGTCAGAAGTAAGTTGGCCGCAGTGTTATCACTCATGG 



FIG. 2 
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TTATGGCAGCACTGCATAATTCTCTTACTGTCATGCCATCCGTAAGATGCTTTTCTGTGACTGGTGAGTACT 
CAACCAAGTCATTCTGAGAATAGTGTATGCGGCGACCGAGTTGCTCTTGCCCGGCGTCAATACGGGATAATA 
CCGCGCCACATAGCAGAACTTTAAAAGTGCTCATCATTGGAAAACGTTC 

TCTTACCGCTGTTGAGATCCAGTTCGATGTAACCCACTCGTGCACCCAACTGATCTTCAGCATCTTTTACTT 
TCACCAGCGTTTCTGGGTGAGCAAAAACAGGAAGGCAAAATGCCGCAAAAAAGGGAATAAGGGCGACACGGA 
AATGTTGAATACTCATACTCTTCCTTTTTCAATATTATTGAAGCATTTATCAGGGTTATTGTCTCATGAGCG 
GATACATATTTGAATGTATTTAGAAAAATAAACAAATAGGGGTTCCGCGCACATTTCCCCGAAAAGTGCCAC 
CTGC Cs£t2^ Xy^^T 

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FIG. 3 

MLLLLLLLGLRLQLSLGIIPVEEENPDFWNREAAEALGAAKKLQPAQTAAKNLI
IFLGDGMGVSTVTAARILKGQKKDKLGPEIPLAMDRFPYVALSKTYNVDKHVPD 
SGATATAYLCGVKGNFQTIGLSAAARFNQCNTTRGNEVISVMNRAKKAGKSVGV 
VTTTRVQHASPAGTYAHTVNRNWYSDADVPASARQEGCQDIATQLISNMDIDVI 
LGGGRKYMFRMGTPDPEYPDDYSQGGTRLDGKNLVQEWLAKRQGARYVWNRTEL 
MQASLDPSVTHLMGLFEPGDMKYEIHRDSTLDPSLMEMTEAALRLLSRNPRGFF
LFVEGGRIDHGHHESRAYRALTETIMFDDAIERAGQLTSEEDTLSLVTADHSHV 
FSFGGYPLRGSSIFGLAPGKARDRKAYTVLLYGNGPGYVLKDGARPDVTESESG 
SPEYRQQSAVPLDEETHAGEDVAVFARGPQAHLVHGVQEQTFIAHVMAFAACLE 
PYTACDLAPPAGTTDAAHPGRSVVPALLPLLAGTLLLLETATAP

FIG. 4 

IIPVEEENPDFWNREAAEALGAAKKLQPAQTAAKNLIIFLGDGMGVSTVTAARI

LKGQKKDKLGPEIPLAMDRFPYVALSKTYNVDKHVPDSGATATAYLCGVKGNFQ 

TIGLSAAARFNQCNTTRGNEVISVMNRAKKAGKSVGVVTTTRVQHASPAGTYAH

TVNRNWYSDADVPASARQEGCQDIATQLISNMDIDVILGGGRKYMFRMGTPDPE 

YPDDYSQGGTRLDGKNLVQEWLAKRQGARYVWNRTELMQASLDPSVTHLMGLFE

PGDMKYEIHRDSTLDPSLMEMTEAALRLLSRNPRGFFLFVEGGRIDHGHHESRA 

YRALTETIMFDDAIERAGQLTSEEDTLSLVTADHSHVFSFGGYPLRGSSIFGLA 

PGKARDRKAYTVLLYGNGPGYVLKDGARPDVTESESGSPEYRQQSAVPLDEETH 

AGEDVAVFARGPQAHLVHGVQEQTFIAHVMAFAACLEPYTACDLAPPAGTTDAA 

HPG (ss? iOo:^) 
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ZCGGCCACAGGACCACCTCCCCGGAG *>9 



MWLVTFLLLLDSLHK 15

AATAQGGC ClSJl ' i ' rA TgGC ATG TGG CTG GTA ACT TTC CTC CTG CTC CTC GAC TCT TTA CAC AAA 143

ARPEDVGTSLYFVNDSLQQV 35

GCC CGC CCT GAA GAT GTT GGC ACC AGC CTC TAC TTT GTA AAT GAC TCC TTG CAG CAC GTG 203

TFSSSVGVVVPCPAAGSPSA 55

ACC TTT TCC AGC TCC GTG GGG GTG GTG GTG CCC TGC CCG GCC GCG GGC TCC CCC AGC GCG 263

ALRWYLATGDDIYDVPHIRH 75

GCC CTT CGA TGG TAC CTG GCC ACA GGG GAC GAC ATC TAC GAC GTG CCG CAC ATC CGG CAC 323

VHANGTLQLYPFSPSAFNSF 95 

GTC CAC GCC AAC GGG ACC CTG CAG CTC TAC CCC TTC TCC CCC TCC GCC TTC AAT AGC TTT 383 

IHDNDYFCTAENAAGKIRSP 115

ATC CAC GAC AAT GAC TAC TTC TGC ACC GCG GAG AAC GCT GCC GGC AAG ATC CGG AGC CCC 443

NIRVKAVFREPYTVRVEDQR 135

AAC ATC CGC GCC AAA GCA GTT TTC AGG GAA CCC TAC ACC GTC CGG GTG GAG GAT CAA AGG 503

SMRGNVAVFKCLIPSSVQEY 155

TCA ATG CGT GGC AAC GTG GCC GTC TTC AAG TGC CTC ATC CCC TCT TCA GTG CAG GAA TAT 563

VSVVSWEKDTVSIIPENRFF 175

GTT AGC GTT GTA TCT TGG GAG AAA GAC ACA GTC TCC ATC ATC CCA GAA AAC AGG TTT TTT 623

ITYHGGLYISDVQKEDALST 195

ATT ACC TAC CAC GGC GGG CTG TAC ATC TCT GAC GTA CAG AAG GAG GAC GCC CTC TCC ACC 683

YRCITKHKYSGETRQSNGAR 215

TAT CGC TGC ATC ACC AAG CAC AAG TAT AGC GGG GAG ACC CGG CAG AGC AAT GGG GCA CGC 743

LSVTDPAESIPTILDGFKSQ 235

CTC TCT GTG ACA GAC CCT GCT GAG TCG ATC CCC ACC ATC CTG GAT GGC TTC CAC TCC CAG 803

EVWAGHTVELPCTASGYPIP 255

GAA GTG TGG GCC GGC CAC ACC GTG GAG CTG CCC TGC ACC GCC TCG GGC TAC CCT ATC CCC 863

AIRWLKDGRPLPADSRWTKR 275

GCC ATC CGC TGG CTC AAG GAT GGC CGG CCC CTC CCG GCT GAC AGC CGC TGG ACC AAG CGC 923

ITGLTISDLRTEDSGTYICE 295

ATC ACA GGG CTG ACC ATC AGC GAC TTG CGG ACC GAG GAC AGC GGC ACC TAC ATT TGT GAG 983

VTNTFGSAEATGILMVIDPL 315

GTC ACC AAC ACC TTC GGT TCG GCA GAG GCC ACA GGC ATC CTC ATG GTC ATT GAT CCC CTT 1043

HVTLTPKKLKTGIGSTVILS 335

CAT GTG ACC CTG ACA CCA AAG AAG CTG AAG ACC GGC ATT GGC AGC ACC GTC ATC CTC TCC 1103

CALTGSPEFTIRWYRNTELV 355

TGT GCC CTG ACG GGC TCC CCA GAG TTC ACC ATC CGC TCG TAT CGC AAC ACG GAG CTG GTG 1163

LPDEAISIRGLSNETLLITS 375

CTG CCT GAC GAG GCC ATC TCC ATC CGT GGG CTC AGC AAC GAG ACG CTG CTC ATC ACC TCG 1223

AQKSHSGAYQCFATRKAQTA 395

GCC CAG AAG AGC CAT TCC GGG GCC TAC CAG TGC TTC GCT ACC CGC AAG GCC CAG ACC GCC 1283
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QDFAIIALEDGTPRIVSSFS 415

CAG GAC TTT GCC ATC ATT GCA CTT GAG GAT GGC ACG CCC CGC ATC GTC TCG TCC TTC AGC 1343

EKVVNPGEQFSLMCAAKGAP 435

GAG AAG GTG GTC AAC CCC GGC GAG CAG TTC TCA CTG ATG TGT GCG GCC AAG GGC GCC CC3 1403

PPTVTWALDDEPIVRDGSHR 455

CCC CCC ACG GTC ACC TGG GCC CTC GAC GAT GAG CCC ATC GTG CGG GAT GGC AGC CAC CGC 1463

TNQYTMSDGT)^ 465 

ACC AAC CAG TAC ACC ATG TCG GAC GGC ACC ' th W-y) 1493 

T&k MAW E) 



FIG. 5 






8 £2 6 -MWLVTF LLLLDSIiHKARPED - VCTSLYFVKMLQQVTFSSS 

D3 8 492 --MCTTLLVSra-LLISLTSCWirrWHKR^ 

P2 02 4XEURO MWRQSTXUUaiLVALUJUafiXSSKGMILPPlUTX QPAPGELLFJCVAQQNKEJD 

P32004EORA fcTATALHYVWPLIMSPCLLXQIFEEYEGHHVm PFVXTEQflFR-RLWFPTD 

P3533XG-CA -MMXEXSISASKASLVFFMQMISAIJWFUDSXLI^^ 

Q02 2 4 6XONI -MOTATRAXPHLtLVAAVALVSSSAWSSALQSQTT -FQFVFEDQPLSVLFPBESTE 

O11031 --MLSWKQLXLLSFXGClAGELLIi--- Q --OFV7VKEFSNSXFPVGSXD 

X65224 MVIiHSHQLTYAGIAFAI^LiHHLISAIEVFLDSNIQSEIjP-QPPTIT^ 



0 
1 



8 £2 6 VGVVVPCPAAOSPSAALRWyiiATGDDIYrVPHISUSVBAKG««TLQLYPP5PSAFNSFIHD 

D3 8 492 GICVSl^CKJLBASPFPVYKWRKN-NaDVDLTN-DRYSMV OQNLVXNNTDKQK-D — A 

P2024XEURO NFFXXECEAIXSQPEPEYSirXXN-aiaeFDWQAYDHRlfl^Q R 

P32004EURA D - X SLKCEASGKFEVQPXirrW-GVHFKFKBXLGVTVYQS PHSGSFTXTGNNSN FAQRFQ 

P35331C-CA N-XVXQCSAXGKPPP5F5WTRN*GTKFDXDKDAQVTKIUrN — SOTLVVNIMNOVXAZAYR 

Q02246XONI BQVLLAClUUUlSPPATYRMCMN-GTEKiaBPaSRHQLV GONXiVXMNPTKAQ-D— A 

U11031 KXXTLMCBARGMPSPHYRWQLN-GSDIiyrSLDHRYKLN GONLIVXNFNRNW-D — T 

X65224 N-IFIECEAXGNPVPTFSWTRN-GKFFNrVAXDPKVS«RR — SGTLVXDFHGGGRPDDYE 

♦ * * * 

8 £2 6 NDYFCFAENAAGXXRS PNI RVKAVFRE P YTVRVEDQRSMR-GNVAVFKCt*IP SSVQEYVS 

D3 8492 GXYYCIASNNYGMVRSTEATLSFGYLDPPPPEDRPEV1CVKZGKGKVLLCDPPYHFFDD-L 

P2 0 2 4 1 EURO GKYQCFA5NXFGTATSNSVYVRXA£LNAFXDSAAXTL£AVEG£PFMLXCAAPDGFPS — P 

P 3 2 0 0 4 SURA GXTRCFASNXLGTAMSH£XRLMASGAPK1fPKXTVKPVTVYEaCSVVLPCNPPPSAEP — L 

P3 5 3 3 XG-CA GVTQCTARNERaAAISNNlVIRPSRSFLtfTXSXLEPinrWEa P 

Q0 2 2 4 6XONI CVTQCLASNPVOTWSREAX LRFGFLQEf 8 KEERDFVKAHEawaVMLPCWPPAHYPG- - L 

U1103 1 GSTQCFATNS LGTX VSREAKLQ F AYLSNF KSRMKSRVSVREGQGWLliCGP P PHSGE — L 

X65224 GETQCFARNDYGTAI-SSXrHl^VSRSPLWPKBKVDVXEVDEaAPI-SLQCNPPPGLPP — P 

3f26 WSWIXDTVSIIPE NR--FFXTYHGGI#YXSDVQKJBD — ALSTYRCITXHKYSGBT 

D3 8492 SYKKLLNBFPVriTM DKRRrVSQ-TOGNLYXAMVESSD RONT«CFVCS — PSIT 

F20241EURO TVNl«XQESII«SXKSXNNSR--MTU3PEONLWFSNVraEDAS^ 

P3 2 0 0 4EURA RI YWHWSKILHIKQ OCR - - VTMOQNGNLYFANVLTSDN — H 5DTXCHAH7 PQTRT I 

P3 5331G-CA I XFttMDNAFQRXiFQ SER--VSQGLMGDLYFSNVQPEDT — RVDTICYARFNHTQTI 

Q02246XONI SYKHIXNEFPNFIFT DORHFVSQ-TTGNLYXARTNASD U3MTSCIATSHMDFST 

UX 1 0 3 1 S YAWFNEY? SFVEE DSRRF VSQ - ETGHLYX AKVEP SD VOmTTCWTS — TVTN 

Xfi S 2 2 4 VXFWWSSSMEPIKQ DKR- - VSQCQNGDLYFSNVMLQDA — QTDYSCNARFHFTHTI 

• * * * * * * 

8 f 26 RQSNOARJUSVTDPAES IPTILDCFKSQEV WAGHTVEL 

D3B492 KSVFSKFXPLIPXPERTT KPYPADIWQFKDXY—TMMGQNVTL 

P2 02 4XEURO XIONKVLXJDfVKQMGVSASQ NKHP PVRQYVSRRQS- LALRGKRMEI- 

P 3 2 0 0 4EDRA X QKEP IDLRVKATNSM XD RXPRIXFPTNSSSKLVALQGQPLVL 

P3S33XG-CA QQKQP 1 5VKVF STK? VTERPPVLLTPMGSTSNKVELRGNVl*tX 

Q02 2 4 6XONI KSVFSKFAQLNLAAEDTR liFAPSXRARFPAETY - -ALVGQQVTL 

UX1031 ARVMSPTPLVLRSDGVMG EYEPXIELQFPETLP— AAKGSTVKL 

X6 S 2 2 4 QQKNFYTtiCNnCTKKPHOTTSIJINOTDMYSA S SQMVLRGVDtXL 

* * * 

8f2€ FCTASGYFIPAXRWLKDGRF — LPADSRWTKRITGLTISDLiRTEDSaTTICEVTNTFaSA 

D38492 ECFALGN P VPD X RWRKVL2 P - - MPTT AEI STSGA VLK X FN X QLEDXGtiYECEAENIRGKD 

P202 41 EURO FCXYGGTFLPQTVWSKDGQRXQWSDRXTQGHYGKSLVXRQTtfTODAGTTrC^ 

P3 2 0 0 4EURA ECXAEGFPTPTXKIftRPSGPM- PADRVTYQNHNKTLQLLKVGEEDTCEYRCLAENSLGSA 

P3 S 3 3 1G-CA ECXAAGLPTPVIRHXKEGGEL- PAKfRTFFENFXKTLKI IDVSEADSGNYKCTARNTLGST 
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QO 2 2 4 6 XONI ECF AFGNFVPRIXWWCVDG SLSPQWTTAEPTLQ I PSVSFEDEOTTECSAENSKGRD 

U11031 BCFAXjCBIPVPQXiroUlSDaMP-FPTKXKI^^ 

X6 5 2 2 4 ECXASOVPAFD mWYKKOQEL - PAGXTXXiENFNKALRI SNVSEECSGETFCLASWXMaS X 

8C26 E-ATGILMVIDPI^HVTLTPKKIdCrGXGSTVTLSCAl#TC 

D3S492 K-HOAJttWQAFPEWN^INirrETOXGSDLYWFCVATO^ 

P20241EURO QSr3IXLNVNSVPYFTKEPEIATAAJ^EI^^EaUAX3^ 

P3 2 0 0 4 CUBA R-HAYWTVEAAPWLHXPQSHLYGFQETAIILDCQVQ 

P35331G-CA H-H\aSVTVlUUlPyWXTAPRKIiVL5PaEXX7rLXaiA2K»P 

QQ2246XONI T-VQGRXXVQAQPEWLK\TCSiyrEADiaSNLRWGC^^ - 

U11031 V- ARGRLTYYAKFYWVQL LXDVETAVEDS L YWWRASGKP1CP S YKWLKKODALVT*£ER — 

X6S224 R -HT X SVRVXAAF YWLDE PQNXiX LAPGEDGRLVCRANGNFKP S X QWL.VNQEP X SCS P FN P 

* * * * * 

Bf26 — — fi LVLFDEAISIRGLSN 

D3B492 -YAYHKQELRLYDVTFENAGMYQCIAENAYGTIYANAEIiKIlAXAPTFI^ 

P2 02 41EURO RRTVTDNTIRX X^VKODTCNYOCNATMSLCSYVVrafVYLNVOAEPP — TISEAPAAVSTV 

P32004EURA KYRIQRGALX LSlAfQP SDlT4VTQC£ARNRKCLXiIJkMAYIYVVQLPA-KXLTADNC^rYKAV 

P3 5331G-CA SRKVDGDT X X F S AVQERS S A^^QCNASNEYGYIJ^JU»IArVNVT-A£PP - RILTPANKLYQVT 

Q0224 6XONI - VEVLAGDLRF SXL S LEDSGMYQCVAENKHGT IYASAZLAVQAXAPDFRLNPVRRL X PAA 

U11031 -IQIENGALTIANIjNV3DSaMFQCIAEMKHaLIYSSA£LKVt^SAPDFSRNPlQCXMIQVQ 

X6522 4 SREVAGDTIVTRDTQIGSSAVYQCNASNEHGYLLiANAFVSV 

8f26 ETLL ITSAQKSHSGATQCFA 

03 S 492 KGGRVXIECKPKAAPKFKFSWSKGTEWLVNSSRILXWED-GSLEXNNXT^^ 

P202 41EURO IX3RNVTIKCRVNOSPKPLVKWLRASNWLT — GGXYNVQANGDLEXQDVTFSDA0KXTCYA 

P3 2 00 4EURA QaSTAYLl^KATGAPVPSVQWIiDETCTT^QOERFFPYANGTLGiroLOANDTORTFC^ 

P3S331G-CA ADS PALXDCAYFG S FKPEI EWFRGVKGS X LRGNE YVFKDNGTLtE X PVAQroSTGTTfTCVA 

Q02246XONX ROGIXLIPCQPRAAPKAVVLWSKGTEILVNSSRVTVTPD-GTLXIRHISRSDEGICrrCPA 

UX103 1 VGSLVXLDCKPSASPRALSFVnUCGDTVVREQARXSLLtJD-GGIJCXKNVTlCADAGXTTCXA 

X6 5 22 4 QYKRTRIJ3CPFFGSPIPTLRWFKNGQGNMLDGGNYXAXENGSLEMSMMIKEDQGXTTCVA 

• * m m * 

3 £ 2 6 TRJUVQTAQDFAI I ALEDGTPR IV5 SFSEKWNPGEQFgLMCAAJCCAP — PFTVTKALDDE 

D3 8492 ENNRGXANSTGTLVITOPT-RXILAPINADITVGENA™QCAASrDPSIJ3LTFVWfiFNGY 

P20241EURO QNKFGEXQAIX33X*VVXEHT-RXTQEPQNYEVAAG0SATFRCNEAHDDTLEIEX0WW1CDGQ 

P3 2004EX7RA ANDQNNVTIMANLKVKDAT-QITQGPRSTIEKKGSRVTFTCQASFDPSLQPSXT1IRGDGR 

P3 533 1G-CA RNXLGKTQNEVOLFUTODPT-KIXKQPQYXVIQRSAQAS — 

Q02246 XONI ENFMGKANSTGILSVRDAT-K ITLAPS SADINLGDNLTLQCKASHDPTMDLTFTWTLDDF 

U11031 aNQFGXANGTTQLVVTEPT-RlIIJVPSNMDVAVaESIILPCQVQHDPLLDIMFAWYTNGT 

X65224 TNIIjGKVEAQVRIjEVKDPT - RIVRGPEDQWKRGSMPRLHCRViaiDPTLKLTVTWLKD - - 



8f 26 PXVRDGSHRTNQYTMS * **>:\ 

D3 8492 VIDFNKEITNIHYQRNFMLDAKGELLIRNAQLKHAGRTTCTAQTIVDNSSASADLVVRaP C 

P20241EURO SIDFEAQPR FVKTNDN - - SLTI AKTKELDSGETTCVARTRX*DEATARANLXVQDV C 

P 3 2 0 0 4 EURA - - DLQELGD SDKYF I EDG - - RLVI H SIiDYSDQGNYSCVASTELDWESRAQLL WGS C 

P3 5 3 3 1G-CA - -NNELPDD ERFLVGXD* -NLriMNVTDKDDGTTTCIVNTTLDSVSASAVLTWAA C_ 

Q02246XONI PXDFDKPGG — HYRRT^A^CETIGDLTXI^OLRHOaKr^C^QT^AmSASKEATVl.VRaP Q 

U11031 LroFKKMS--HFElC\^SSS-GDl«IRKIOLKHSGK C 

X65224 --DAPLYIG MRMKKEDD- -GLTIYGVAEXDQGDYTCVASTEt#DKDSAXAYLTVI*AI C 
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